Unix software technologies; constituent pieces and underlying concepts

William Safire "Knowing how things work is the basis for appreciation, and is thus a source of civilized delight."

Introduction

In the section above, we've explained what the system does to get from power-on to a usable state. By now, you should also have learned how to log in, of course, and wander around the system a little. The system you see, however, is very much "alive"; it's not just a collection of commands and files waiting to be run or read. Those "live" parts (or subsystems) are crucial to Unix and account for a lot of what Unix stands for. We are about to poke around the neighborhood and meet the crowd.

System Login

By system login, we mean the act of verifying one's credentials, setting up access rights, and letting users proceed with their computer session. Exactly what that session looks like depends on the actual service requested and the type of the users' client software. In general, users are given individual accounts, to which they can log in. There are two main groups of accounts:

System accounts - accounts that are registered on a system level, usually in the files /etc/passwd, /etc/group and /etc/shadow. These files form the traditional Unix user authentication scheme, although such information can also be kept in various databases, for example in so-called directories, which consist of key:value pairs and are optimized for massive read-only access (LDAP). System accounts are service-independent and deeply rooted in Unix philosophy. One of their key values is full accountability in terms of dates and times of access, performed actions and system resources used. Typical examples are the accounts you use to access all telnet, SSH and FTP services. Those "real" accounts will be our primary concern, and we shall refer to them as system accounts or simply accounts.

Virtual accounts - accounts that are not registered on a system level, and instead live in service-specific databases. Those databases could be based on files or LDAP behind the scenes as well, but because virtual account solutions are popular for their simplicity and ad-hoc setup (except for a few notable implementations), most of them today seem to live in MySQL databases. Typical examples of virtual accounts in use are various Web shops, Web memberships, mailing list memberships or "inventions" like e-mail-over-web. Virtual accounts have also been fairly popular in setups where users do access their e-mail using proper protocols, but only have "virtual mailboxes" on the servers instead of real accounts.

As we've mentioned, virtual accounts are mostly service-dependent and, lacking any formality in both the design and the implementation phase, are inherently inconvenient to account for. Instead of re-using the established system infrastructure, applications must handle virtual users in their own ways. In addition, instead of performing tasks under the appropriate system accounts' privileges, such applications run under a single username, further complicating any deeper access control and usage statistics.

We see how the computing world around us has changed over the past 10 years (for better or worse). Almost everything nowadays uses some form of virtual accounts; this is virtual, that is virtual, everything is virtual! Actually, this is too funny. Here's what I wrote about virtual accounts in this Guide back in 2002: "Virtual accounts are a disaster (except, again, for a few notable fields of use and implementations) and have blossomed since 1995 onwards, the period that was characterized by the advent of 'personal computers' and the disappearance of all technical correctness from the civil sector."

Console login

Probably the most straightforward way to log in to the system is to sit between keyboard and chair, and log in at the local system's console prompt. In general, a variant of the getty program will be listening on the consoles to receive your authentication info. The Debian default, /sbin/getty, was spawned by /sbin/init. /sbin/init, in turn, took its configuration from the already-mentioned /etc/inittab file. There are many getty variants available (try running apt-cache search --names-only getty for example), but "getty" has also been established as a kind of generic name for the whole class.

It's interesting to note that getty reads in only the initial username, and then pulls out of the deal by passing control to the /bin/login program, which asks for the password. The question is, however, what happens if you type in a wrong username or password (or do not authenticate successfully for some other reason)? Since getty is out of the game, the login program itself will serve you another prompt, although it will look exactly the same as the original getty one. Only if you fail to authenticate a couple of times in a row, or terminate your session, will /bin/login close down and (thanks to init) /sbin/getty be respawned ("started again") to wait for new logins.

Determining who's behind the login prompt: /sbin/getty or /bin/login? If you press getty. Otherwise, such as if there is a timeout first, you're talking to /bin/login.

The 'login' shell

Supposing you manage to authenticate successfully and the getty or login programs let you through, what happens next? Well, before we can answer that, you first need to get familiar with your entry in the /etc/passwd file. There are many ways to retrieve it: you could open the file in a text editor and search for your username, you could run grep $USER /etc/passwd, or you could run getent passwd $USER. The last variant is suggested, as it works with any configured user authentication scheme. A sample entry might look like this:

    $ getent passwd $USER
    mirko:x:1000:1000:Mirko,,,:/home/mirko:/bin/bash

Fields 5, 6 and 7 specify the users' GECOS (descriptive) information, their home directory, and the default shell. Generally, after you have been authenticated, the software spawns the specified shell for you and changes to your home directory.

Since Unix sites and users often configure their environment, there are global tuning files available, /etc/profile and /etc/environment (the first is an executable script, the other is a collection of KEY=VALUE pairs and does not exist by default). The bash shell also reads /etc/bash.bashrc and possibly other /etc/bash* files (if configured to include them). After the site-wide configuration files are honored, the shell reads its corresponding user-specific dotfiles at startup. Again, in the case of the bash shell, those are ~/.bash_profile or ~/.bashrc.

At this point, it is important to learn the difference between login and non-login shells. Login shells are the special case where users are at the other end of a connection (instead of a batch script file or another program) and use the terminal interactively. When you log in to the system using telnet or SSH, you're given a login shell. Login shells read ~/.bash_profile, which should contain settings relevant for interactive work (command aliases, prompt display, etc.). All other shells are non-login shells.

The root user does not read the /etc/profile file and, by Debian convention, its dotfile is ~/.profile instead of ~/.bash_profile (but this is in no way enforced: if ~/.bash_profile were present, it would take precedence). We could mention that the "shell language" was standardized by POSIX, so any shell files that are not specific to bash should avoid bash-only constructs ("bashisms"). And indeed, there's a strong movement present in Debian to free the maintainer scripts of all non-POSIX-compliant constructs. Following the analogy, your root user's ~/.profile file should be written with the POSIX sh standard in mind. See the Korn Shell (ksh) Programming page for additional information.

It is also useful to note that the shell does not use any secret techniques to read the dotfiles; it evaluates them in the context of the existing process using the source or . (a dot) command. When those "startup" tasks are performed, the system shows the command prompt and is ready to accept commands.
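To make the layout of a passwd entry concrete, here is a minimal Python sketch that splits the sample entry shown earlier into its seven colon-separated fields. The names PasswdEntry and parse_passwd_line are made up for this illustration; on a real system, the standard pwd module (pwd.getpwnam) performs the lookup for you.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One passwd entry; the field names are illustrative, not an official API."""
    name: str
    password: str   # usually "x", meaning the hash lives in /etc/shadow
    uid: int
    gid: int
    gecos: str      # descriptive information (real name and so on)
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split a single passwd line into its seven colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


entry = parse_passwd_line("mirko:x:1000:1000:Mirko,,,:/home/mirko:/bin/bash")
print(entry.gecos, entry.home, entry.shell)
```

The last three fields are the ones the login process cares about most: the descriptive information, the directory to change into, and the shell to spawn.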

Account login regulation

Since most of the accounts on your machine will be used locally, by yourself, there's no reason to let people log in remotely, right? You could be interested in giving your friends access, but that's a different issue; you would give them their own accounts and take some basic precautions before opening the system to the World. This is all pretty hard to explain right now, because it already touches that magic World of Unix security, which is so broad and diverse that any immediate commentary on it would distract us noticeably, even if we ignored the "intuitive" thinking and stuck to formal definitions.

So anyway, as we concluded that we don't want people logging in remotely, edit the file /etc/security/access.conf, read the short introductory text included in the file, and add something like this to the end:

    -:root mirko ante:ALL EXCEPT LOCAL

The above would deny access to root, mirko or ante, except from the localhost. Settings in the /etc/security/access.conf file are honored because the PAM subsystem can be configured to read them, as we'll see explained in the next section.
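The structure of such a rule (permission, users, origins, separated by colons) can be sketched in a few lines of Python. This is only an illustration of the line format; it does not evaluate the rule the way the PAM access module does, and the names AccessRule and parse_access_rule are invented for the example.

```python
from typing import List, NamedTuple


class AccessRule(NamedTuple):
    """Illustrative container for one access.conf rule."""
    allow: bool          # True for "+", False for "-"
    users: List[str]
    origins: List[str]   # kept as raw tokens, e.g. ["ALL", "EXCEPT", "LOCAL"]


def parse_access_rule(line: str) -> AccessRule:
    """Split a 'permission:users:origins' line into its three parts."""
    # Split on the first two colons only, so that origins may contain colons.
    permission, users, origins = [part.strip() for part in line.split(":", 2)]
    if permission not in ("+", "-"):
        raise ValueError(f"permission must be '+' or '-', got {permission!r}")
    return AccessRule(permission == "+", users.split(), origins.split())


rule = parse_access_rule("-:root mirko ante:ALL EXCEPT LOCAL")
print(rule)
```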

Pluggable Authentication Modules (PAM)

So far, you should have understood that, in Unix, there are many data protocols (FTP, HTTP, telnet, IRC, ...) and their implementations (vsftpd, Apache, telnetd, dancer-ircd, ...). Since most of the services require user authentication, it becomes obvious that supporting all kinds of authentication in every service would be hard, require a lot of manual and repetitive work, and be error-prone. On top of that, implementations would most probably end up being inconsistent, having different interpretations of "standard", and containing subtle, hard-to-find bugs.

Fortunately, computer science is old enough that people managed to spot the problem and think about possible remedies. The idea that Sun came up with was a generic Pluggable Authentication Module layer, or simply PAM. Generally, each service makes a straightforward call to PAM and expects a Yes or No type of answer. This allows for a one-size-fits-all approach in client software; to perform all authentication work, simply invoke PAM and don't worry.

Even though PAM only returns a positive or a negative final answer, one could suppose that PAM uses more sophisticated techniques in reaching this boolean (Yes/No) conclusion. And indeed it is so. Each service drops a piece of its PAM configuration into the PAM config files. That configuration can request arbitrary authentication steps to be performed, combined and stacked in any order (including either-or variants). There's also a default which you can use to handle multiple services with the same config file. For example, you could configure PAM to authenticate the user if either his retinal scan matches the database, or he possesses both the correct RSA private key and a one-time password. And supporting an arbitrary other authentication scheme becomes as easy as writing a PAM module to handle the specific method.

There are three main PAM implementations in use today: Solaris PAM used by the Solaris OS, Linux-PAM used by all Linux "distributions", and OpenPAM used by BSD-derived systems. Linux-PAM is also the PAM implementation used by Debian.

One very unfortunate fact is that, while PAM itself provides a standardized API even for requesting additional input from the user (which is quite a feat), it does not standardize the logging interface. Some Linux-PAM modules do not log at all, and those that do are not forced to consistency by formal methods. This is such a critical omission that it puts PAM practice in a completely different light. The solution to the PAM logging problem, however, came unexpectedly. Sebastien Tricaud added Prelude support to PAM 0.79, so PAM can now consistently report all the action to the Prelude manager.
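The idea of stacking checks into a single Yes/No answer can be illustrated with a toy model in Python. This is not the Linux-PAM API or its configuration syntax; the check functions, the credential keys and the sample values are all invented for the sketch. It only shows how the "retinal scan, or RSA key plus one-time password" policy from the text reduces to one boolean.

```python
from typing import Callable, Dict

Credentials = Dict[str, str]
Check = Callable[[Credentials], bool]


def all_of(*checks: Check) -> Check:
    """Succeed only if every stacked check succeeds."""
    return lambda creds: all(check(creds) for check in checks)


def any_of(*checks: Check) -> Check:
    """Succeed if at least one alternative succeeds (the either-or variant)."""
    return lambda creds: any(check(creds) for check in checks)


# Hypothetical checks; real modules would consult a database or a token server.
def retina_matches(creds: Credentials) -> bool:
    return creds.get("retina") == "scan-0001"


def rsa_key_valid(creds: Credentials) -> bool:
    return creds.get("rsa_key") == "key-0001"


def otp_valid(creds: Credentials) -> bool:
    return creds.get("otp") == "123456"


# Retinal scan alone, or RSA key together with a one-time password.
policy = any_of(retina_matches, all_of(rsa_key_valid, otp_valid))


def authenticate(creds: Credentials) -> bool:
    """The service sees only this single Yes/No answer."""
    return policy(creds)
```

The calling service needs to know nothing about how the answer was reached, which is the point of the design: changing the policy means changing configuration, not the service.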

System task schedulers

Computers do one thing well: they happily execute highly repetitive tasks that you could never complete yourself in a reasonable amount of time (let alone the boredom experienced along the way). From that perspective, it's obvious that every serious operating system should have a way to schedule tasks for execution at some later time, or in a repetitive (periodic) fashion.

The "pioneering" work in automated schedulers was done by a chief of an IBM-powered farm (with crops, animals and all), back in the 1970s. He reduced three 8-hour shifts to two 8-hour shifts, replacing the third person (who had practically nothing to do but run one system command at 3:00 am) with a timer-powered Lego block that would drop from a height onto the Return key.

Unix systems today have two schedulers available, at and cron; batch queueing systems such as Generic NQS exist as well.

At

As you might conclude from the command name, /usr/share/doc/at/timespec. For example, you could try echo "echo Hello, World" | at now + 1 minute. In a minute, you should see "Hello, World" in your mailbox. The example supplied the command "in place", but this is Unix, so you can also save the set of commands to execute in a file (say, cmds.at) and run at -f cmds.at now + 1 minute. You can view pending jobs by running

Cron

In essence, the cron configuration file defines each task on its own line. In turn, each line consists of a 5-field time specification and the task to execute. The five time fields indicate minutes, hours, days of month, months, and days of week. Here are a couple of examples to clarify the subject:

    # Run each minute
    * * * * * /usr/local/bin/syscommand

    # Run every 15 minutes
    0,15,30,45 * * * * /usr/local/bin/syscommand

    # Run every 15 minutes, enhanced specification
    */15 * * * * /usr/local/bin/syscommand

    # Run every 2 hours
    0 */2 * * * /usr/local/bin/syscommand

    # Run once every hour in period from 8:00 am to 3:00 pm
    0 8-15 * * * /usr/local/bin/syscommand

Cron configuration file is interesting. Make sure you read

System crontab

As usual, Debian contains crontab in the base system. There are a number of useful things going on in the system, even when you've installed nothing but the minimal setup. Debian's system crontab file is (would you guess?) /etc/crontab. You can see that, in between the time specification and the command to execute, this specific file accepts the Unix username to run the task as. (While this itself is convenient and easy to look into, you can of course specify a different username in the command specification as well.)

Furthermore, you see that Debian prepared /etc/cron.*/ directories where both you and packages' postinstall scripts can simply drop tasks to execute. For example, if you want to execute something once a day, just drop a script into /etc/cron.daily/. If, on the other hand, you want to control the time exactly, drop a file in /etc/cron.d/, where crontab config files are expected (or, if you must, edit /etc/crontab directly).
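How a time specification is matched against the clock can be sketched in Python. The sketch below splits a system crontab line into its five time fields, the username and the command, and then checks whether a given moment matches. It is a simplified model, not cron's actual implementation: all five fields must match (real cron treats day-of-month and day-of-week specially when both are restricted), names such as month or weekday abbreviations are not supported, and all function names are invented for the example.

```python
from datetime import datetime
from typing import List, Set, Tuple

# Allowed value range for each time field, in crontab order.
RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, low: int, high: int) -> Set[int]:
    """Expand one field ('*', '5', '8-15', '0,15,30', '*/15') into a set of values."""
    values: Set[int] = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def parse_system_crontab_line(line: str) -> Tuple[List[str], str, str]:
    """Split an /etc/crontab style line into (time fields, user, command)."""
    parts = line.split(None, 6)
    if len(parts) != 7:
        raise ValueError("expected 5 time fields, a user and a command")
    return parts[:5], parts[5], parts[6]


def matches(time_fields: List[str], when: datetime) -> bool:
    """Return True if 'when' satisfies all five time fields."""
    # cron counts Sunday as 0 (or 7); Python's weekday() counts Monday as 0.
    actual = [when.minute, when.hour, when.day, when.month, (when.weekday() + 1) % 7]
    for index, (field, (low, high)) in enumerate(zip(time_fields, RANGES)):
        allowed = expand_field(field, low, high)
        if index == 4:
            allowed = {value % 7 for value in allowed}  # treat 7 as Sunday
        if actual[index] not in allowed:
            return False
    return True


fields, user, command = parse_system_crontab_line(
    "0 */2 * * * root /usr/local/bin/syscommand"
)
print(user, command, matches(fields, datetime(2024, 1, 1, 4, 0)))
```

A scheduler built on this would simply wake up once a minute and run every entry for which matches() is true.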

Users' crontab

System users can also have their own crontabs. All you have to do as a system user is to run Besides running crontab . Administrators can allow or forbid system users to use crontab; look for cron.allow and cron.deny in the crontab manpage.

Inet Meta Daemon

Inetd is yet another interesting concept, but it needs a little general introduction first. As you might or might not know, Unix Following the above logic, it became meaningful to have a specialized server that only listens for client connections, and then forks the appropriate daemons to handle actual requests. The result is the Inet Meta Daemon.

Inetd, however, did not have a shining security record, and it became too inflexible and slow for today's standards. In addition, Inetd needed non-transparent support in every server program, so no wonder it slowly got out of mainstream Unix. But we still mention Inetd here for numerous reasons: it's an important part of Unix, it's still useful for particular applications, and it can be easily overlooked when trying to increase the overall network security of your system.

There are a few Inetd implementations, but the default used by Debian is the openbsd-inetd variant. (Previously, Debian used the implementation from the venerable NetKit, still available as package /etc/inetd.conf and you should disable all the unnecessary services in it (probably all there are) and call the usual sudo invoke-rc.d openbsd-inetd reload.
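The "listen first, start the real server only when a client shows up" idea can be sketched in Python. This is a toy under stated assumptions, not how any inetd implementation is written: it handles a single connection, waits for the handler instead of going back to listening, and the handler (HELLO_HANDLER) is a made-up command that prints one line. What it does show is the key trick: the handler program just reads standard input and writes standard output, while the meta daemon wires those to the network socket.

```python
import socket
import subprocess
import sys
import threading
from typing import List

# Hypothetical handler program: any command that talks via stdin/stdout would do.
HELLO_HANDLER = [sys.executable, "-c", "print('hello from handler')"]


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open a listening TCP socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def serve_one_connection(listener: socket.socket, handler_argv: List[str]) -> None:
    """Accept one client and run the handler with the socket as stdin/stdout."""
    conn, _peer = listener.accept()
    with conn:
        subprocess.run(handler_argv, stdin=conn.fileno(), stdout=conn.fileno(), check=False)


def serve_in_background(listener: socket.socket, handler_argv: List[str]) -> threading.Thread:
    """Run serve_one_connection in a thread so a client can connect from the same script."""
    worker = threading.Thread(
        target=serve_one_connection, args=(listener, handler_argv), daemon=True
    )
    worker.start()
    return worker


def read_reply(port: int, host: str = "127.0.0.1") -> bytes:
    """Small client used to try the sketch: connect and read until the server closes."""
    data = b""
    with socket.create_connection((host, port), timeout=10) as client:
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            data += chunk
    return data
```

To try it, create a listener with make_listener(), start serve_in_background(listener, HELLO_HANDLER), and call read_reply(listener.getsockname()[1]); the handler process is started only after the client connects.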

E-Mail

Debian uses an extensive e-mail system based on the Exim mailer. See packages Exim is too elaborate to cover here. What's important is that, at installation time, it asks you a couple of questions and in most cases configures a basic, working e-mail server on the machine. From that point on, it's easy (or "easier") to implement your modifications or setup requirements. The upside is that, being the Debian default, it got all related Debian packages to work with it out of the box, so you can get many additional programs, such as greylistd (a greylisting implementation, one of the spam prevention methods) or mailman (a mailing list manager), to work with it with no or minimal effort on your part. To get a grip on Exim, see the documentation at Exim.org. To get a grip on the Debian packaging and file layout, see /usr/share/doc/exim4/README.Debian.gz.

TCP Wrappers

To restrict access to our systems and services, we can use packet-level and application-level solutions. Packet-level solutions are usually called firewalls. We can, however, control access on an application level too. Application-level control can be implemented using proxies (content-based), TCP Wrappers (source/destination-based), custom methods, or a combination of those. As the section title says, we're going to take a look at TCP Wrappers here.

Basically, TCP Wrappers serve as a generic application-level access control mechanism, and were first developed by Wietse Venema. TCP Wrappers were most useful in combination with Inetd, but have since been integrated into a number of standalone services. When a connection request reaches the system (and the corresponding service listening for requests), all the application has to do is call for a TCP Wrappers check. Based on connection details (remote IP, remote username, destination service etc.), TCP Wrappers pass or deny requests. At that point, the application either continues with the client authentication (username/password mostly), or closes the connection.

TCP Wrappers are a standard part of Debian. For more information see and manual pages. TCP Wrappers can also serve as an example of professional programming practices: they come with a set of additional programs developed to conveniently test your configuration files and hypothetical connections; see and manual pages.

To deny all services to remote addresses, make sure the file /etc/hosts.allow is empty, and put this in /etc/hosts.deny:

    ALL: ALL EXCEPT LOCAL 127.0.0.1: DENY

For more information (including on how to trigger system commands upon incoming requests) read and manual pages.

Please Note: TCP Wrappers and a firewall have very little in common; the level at which the allow/deny decision takes place is fundamentally different. With a firewall, it happens on the lowest, packet level: the packet targeted at, say, an FTP port could be dropped by the firewall as soon as it gets received by the network hardware and processed by the operating system's network layer; it would never reach the FTP daemon. With TCP Wrappers, the packet does reach its destination (Inetd, or a standalone service). The validity check must be explicitly called for by the handling application, and is usually performed before the server forks.

Today, one of the best-known uses of TCP Wrappers is via Denyhosts, which adds the IP addresses of hosts with repeated failed SSH logins to /etc/hosts.deny. When that happens, the client will see a connection error: ssh_exchange_identification: Connection closed by remote host. The IP will be expired from the list automatically after a while. (On a side note, your first line of defense on SSH is to deny direct root logins using "PermitRootLogin no" in /etc/ssh/sshd_config. Denyhosts will then take care of the rest.)
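The order in which the two files are consulted can be sketched in Python: a match in hosts.allow grants access, otherwise a match in hosts.deny refuses it, and if neither matches, access is granted. The pattern matching here is deliberately reduced to three forms (ALL, an exact address, and an address prefix ending with a dot); daemon names, EXCEPT, LOCAL and options are left out, and the function names are invented for the example.

```python
from typing import Iterable


def matches_pattern(pattern: str, client_ip: str) -> bool:
    """Very reduced client matching: 'ALL', an address prefix like '192.168.1.', or an exact IP."""
    if pattern == "ALL":
        return True
    if pattern.endswith("."):
        return client_ip.startswith(pattern)
    return pattern == client_ip


def is_allowed(client_ip: str, allow: Iterable[str], deny: Iterable[str]) -> bool:
    """Apply the allow-then-deny order; the default when nothing matches is to grant access."""
    if any(matches_pattern(pattern, client_ip) for pattern in allow):
        return True
    if any(matches_pattern(pattern, client_ip) for pattern in deny):
        return False
    return True


# Example policy: only the local machine and one private network get through.
allow_patterns = ["127.0.0.1", "192.168.1."]
deny_patterns = ["ALL"]
print(is_allowed("192.168.1.5", allow_patterns, deny_patterns))
print(is_allowed("203.0.113.9", allow_patterns, deny_patterns))
```

An application that supports TCP Wrappers performs a check of this kind right after accepting the connection, which is why, unlike with a firewall, the request has already reached the service when the decision is made.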

Conclusion

Congratulations on following through the Guide. I initially wrote it in 2002, and things have changed enormously since then. However, better understanding and deeper knowledge always have value, and with Linux, maybe even more so today than they had before.

There are a couple of other sections of the Guide I had in mind, but I either didn't get a chance to write them, or the systems I described changed their implementations radically enough that the chapters needed a complete rewrite, and in the absence of time to do it I just removed them. Some sections are also of lower quality, text-wise, but they nevertheless contain various interesting technical bits.

Anyway, altogether, I hope you enjoyed this brief "mix of everything"! I invite you to continue reading other, more serious guides from the Spinlock Solutions' DKLAR series, the &DKLAR-KRB;, &DKLAR-LDA;, &DKLAR-AFS; and &DKLAR-RAD;.

Cheers!
Davor Ocelic, Spinlock Solutions