This page is stolen and adapted from here. I (RN: Marc Herbert) have only added the SVR4 column.

A comparison of various package formats.

This is a comparison of the deb, rpm, tgz, slp, and pkg package formats, respectively used in the Debian, Red Hat, Slackware, and Stampede linux distributions (pkg is the SVr4 package format, used in Solaris). I've had some experience with each of the package formats, both building packages, and later in my work on the Alien package conversion program.

I've tried to keep this comparison unbiased; however, for the record, I'm a fan of the deb format, and a Debian developer. If you discover any bias or inaccuracy in this comparison, or any important features of a package format I have left out, please mail me so I can correct it. Several people have already done so. I'm also looking for data to fill in the places marked by `??'.

This comparison deals only with the package formats, not with the various tools (dpkg, rpm, etc.) that are used to deal with and install the packages.

Package format comparison table.

feature                                 deb      rpm       tgz       slp      pkg

Security, authentication, and verification
  signed packages                       yes [1]  yes       no        no       no
  checksums                             yes      yes       no        no       yes
  permissions, owners, etc              yes      yes       yes       yes      yes

Usability by standard linux tools
  recognizable by file                  yes      yes       no        no       yes
  unpackable by standard tools          yes      no [3]    yes       yes [4]  usually no [5]
  metadata accessible by standard tools yes      no        N/A       no       usually no [5]
  creatable by standard tools           yes      no        yes       no       no

Metadata
  dependencies                          yes      yes       no        yes      yes
  recommendations                       yes      no        no        no       no
  suggestions                           yes      no        no        no       no
  conflicts                             yes      yes       no        yes      yes
  virtual packages and provides         yes      yes       no        ??       no
  versioned dependencies and conflicts  yes      yes       no        ??       yes
  complex boolean dependencies          yes      yes [6]   no        no       no
  file dependencies                     no       yes       no        no       no
  copyright info                        no [8]   yes       no        yes      no
  grouping                              yes      yes       no        no       yes
  priority                              yes      no        no        yes      no

Special files
  config files                          yes      yes       no        yes      yes
  documentation files                   no       yes       no        no       no [9]
  ghost files                           no       yes       no        no       no

Package programs
  binary programs allowed               yes      no        ??        yes      no
  pre-install program                   yes      yes       no [10]   no       yes
  post-install program                  yes      yes       yes       yes      yes
  pre-remove program                    yes      yes       no [10]   no       yes
  post-remove program                   yes      yes       yes [10]  no       yes
  verify program                        no       yes       no        no       no
  triggers                              no       yes       no        no       no

Scalability
  no hard-coded limits                  yes      yes [11]  yes       no       usually no [5]
  new metadata                          yes      yes [12]  N/A       no       usually no [5]
  new section                           yes      no        no        no       usually no [5]
  format version data                   yes      yes       no        yes      usually no [5]

What is compared.

Security, authentication, and verification.

This section deals with ensuring that you know who created the package, and that you can check the package installed on your system to see if the files in it have been modified since you installed it.

signed packages
Does the package format contain internal support for a GPG or PGP signature that can be used to verify who created it?
checksums
Are checksums available for all the files in the package?
permissions, owners, etc
Is information on the files in the package, their proper permissions, sizes, owners, groups, major and minor number (for devices), etc, available?
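To make the point of per-file checksums concrete, here is a minimal Python sketch that reports which installed files no longer match a recorded digest. The manifest layout (a mapping from relative path to hex digest), the function names, and the choice of SHA-256 are all assumptions made for this example; I'm not claiming any of the formats compared here stores its checksums this way.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def modified_files(manifest, root="."):
    """List manifest paths that are missing or whose contents changed.

    `manifest` is a hypothetical mapping: relative path -> expected digest.
    """
    changed = []
    for relative_path, expected in sorted(manifest.items()):
        target = Path(root) / relative_path
        if not target.is_file() or file_digest(target) != expected:
            changed.append(relative_path)
    return changed
```

A format that carries no checksums (tgz and slp in the table) gives a tool nothing to put in such a manifest, so this kind of check is not possible after installation.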

Usability by standard linux tools.

Recognising that it's important sometimes to be able to peer inside packages without using their package managers, this section compares how the various packages can be processed with tools available on any linux system [2].

recognizable by file
Can the file command recognize the package format?
unpackable by standard tools
Can the package be unpacked using standard tools, without too much difficulty? (I add the proviso that it not be difficult because I don't mean "can you write a program in vi and compile it with cc to unpack the package", nor do I mean "can you write a 20 line shell script that unpacks the package". The unpacking must be quick, and must use a set of commands that can be remembered without difficulty.)
metadata accessible by standard tools
If the package has some sort of metadata (e.g., package name, description, version) contained in it, can this data be accessed by standard tools, without too much difficulty?
creatable by standard tools
Can a package be created using standard tools, without too much difficulty?
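The "recognizable by file" question comes down to whether a package starts with a distinctive byte sequence. The Python sketch below shows the idea for three cases: a deb is an ar archive whose first member is named debian-binary, an rpm begins with the four-byte magic of its lead, and a tgz begins with the generic gzip magic. The function name and the returned labels are invented for this example, and slp and pkg are left out because I don't want to guess at their signatures.

```python
DEB_MAGIC = b"!<arch>\ndebian-binary"  # ar archive, first member debian-binary
RPM_MAGIC = b"\xed\xab\xee\xdb"        # first four bytes of the rpm lead
GZIP_MAGIC = b"\x1f\x8b"               # any gzip stream, not only packages


def guess_format(header):
    """Guess a package format from the first bytes of a file."""
    if header.startswith(DEB_MAGIC):
        return "deb"
    if header.startswith(RPM_MAGIC):
        return "rpm"
    if header.startswith(GZIP_MAGIC):
        # A tgz looks like every other gzip-compressed file at this level.
        return "gzip data (possibly tgz)"
    return "unknown"
```

This is why tgz gets a "no" in the table: the best such a check can say is "gzip data", which does not distinguish a package from any other compressed tarball.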

Metadata.

Metadata is my term for the information about a package contained in the package. This includes things like the package name, description, and version number.

dependencies
A dependency says a package needs another package to be installed for the first package to work properly.
recommendations
A recommendation says a package will almost always need to have another package installed.
suggestions
A suggestion says a package may sometimes work better if another package is installed. The user can just be informed of this as an FYI.
conflicts
A conflict is a package that cannot be installed when this package is installed. One common reason is if the two packages both contain the same files.
virtual packages and provides
This means that there are so-called "virtual packages", such as a web browser, or a mail delivery system, and packages can say they provide those virtual packages, while other packages can depend on the virtual packages.
versioned dependencies and conflicts
A package can depend on or conflict with (or recommend, etc.), a specific version of a package, or all versions > or < a given version.
complex boolean dependencies
This means that a package can depend on a package AND (another package OR a third package).
file dependencies
This means a package can require that some other package - any other package - be installed that contains a given file (like /bin/sh) [7].
copyright info
The package's metadata contains basic copyright information. This is useful for automatic copyright sorting, etc.
grouping
The package can be assigned to a group (e.g., web browsers, libraries), which might be used to group the packages when viewing a list of available packages, etc. This makes it easier to deal with large groups of packages.
priority
The package can be assigned a priority, which says how important this package is to the system. For example, packages with high priority should be looked at carefully when you are setting up a system, but you can skip installing all the packages with low priority and still know you'll get a functional unix system.
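Several of the metadata features above (dependencies, virtual packages, versioned dependencies, and boolean AND/OR) can be tied together in one small Python sketch. Everything here is an assumption made for illustration: the data layout (a list of OR-groups that are ANDed together), the function names, and above all the version comparison, which simply compares dot-separated integers and is much cruder than what real package managers do.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">=": operator.ge, ">": operator.gt}


def parse_version(text):
    """Simplified: '2.10' -> (2, 10). Real version schemes are richer."""
    return tuple(int(part) for part in text.split("."))


def alternative_satisfied(alternative, installed, provides):
    """Check one (name, op, version) entry; op and version may be None."""
    name, op, version = alternative
    if name in installed:
        if op is None:
            return True
        return OPS[op](parse_version(installed[name]), parse_version(version))
    if op is None:
        # An unversioned dependency may be met by any installed provider
        # of a virtual package with this name.
        return any(p in installed for p in provides.get(name, ()))
    return False


def dependencies_satisfied(depends, installed, provides):
    """`depends` is a list of groups: every group needs one true alternative."""
    return all(
        any(alternative_satisfied(alt, installed, provides) for alt in group)
        for group in depends
    )
```

For example, with `installed = {"libc": "2.1", "exim": "3.12"}` and `provides = {"mail-transport-agent": {"exim", "sendmail"}}`, a package that depends on libc >= 2.0 AND mail-transport-agent AND (lynx OR exim > 3.0) would be considered satisfied.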

Special files.

The ability to categorize files depending on what they are used for, so they can be dealt with in special ways.

config files
Are config files supported? These are files that the user will typically want to edit, so when a new version of a package is installed, the package manager should be able to know to leave them alone, or do something smart like prompt the user for what to do if they have modified the files, or at least make backups of the user's changes before overwriting them. (Maybe I need more granularity here?)
documentation files
Can documentation files be specially marked? This could be useful to help a user find documentation.
ghost files
Ghost files are files that are not actually present in the package, but are listed as being a part of it once the package is installed. This is useful for log files.
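The config file behaviour described above (leave the file alone, or prompt when the user has modified it) can be sketched as a small decision function in Python. This is only one possible policy, written under the assumption that the package manager remembers a digest of the config file as shipped by the old package; the function name and the returned strings are made up for the example.

```python
def config_file_action(old_packaged, new_packaged, on_disk):
    """Decide what to do with a config file on upgrade.

    Each argument is a digest (any comparable value): the file as shipped
    by the old package, as shipped by the new package, and as found on disk.
    """
    user_changed = on_disk != old_packaged
    package_changed = new_packaged != old_packaged
    if not user_changed:
        # Nothing of the user's to lose, so the new version can go in.
        return "install new version"
    if not package_changed:
        # Only the user touched it: leave their edits alone.
        return "keep user's file"
    # Both sides changed: ask, or at least back up before overwriting.
    return "prompt user"
```

None of this is possible unless the format lets the packager mark which files are config files in the first place, which is what the table row records.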

Package programs.

These are programs that are contained in the package, to be run by the package manager when the package is installed, or uninstalled, or at other times.

binary programs allowed
Must these programs be scripts, or can compiled binaries be used as well?
pre-install program
A program to be run by the package manager before the package is installed on the system.
post-install program
A program to be run by the package manager after the package is installed on the system.
pre-remove program
A program to be run by the package manager before the package is removed.
post-remove program
A program to be run by the package manager after the package is removed.
verify program
A program to be run by the package manager when the state of the installed package is being verified.
triggers
This is a whole set of programs that are run not when this package changes state, but when another package changes state.
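To show where each of these programs fits, here is a toy Python model of a package manager running them. It is a sketch under my own assumptions, not a description of how dpkg, rpm, or any other tool works: the `Package` class, the hook names, the `trigger_on` field, and the exact point at which triggers fire are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Package:
    name: str
    # Optional programs, keyed by hypothetical hook name.
    hooks: Dict[str, Callable[[], None]] = field(default_factory=dict)
    # Names of other packages whose state changes fire this package's trigger.
    trigger_on: List[str] = field(default_factory=list)


def run_hook(package, hook_name, log):
    """Run one package program if the package ships it, and record the call."""
    hook = package.hooks.get(hook_name)
    if hook is not None:
        hook()
        log.append(f"{package.name}:{hook_name}")


def fire_triggers(changed, installed, log):
    """Run the trigger of every installed package watching `changed`."""
    for other in installed:
        if changed.name in other.trigger_on:
            run_hook(other, "trigger", log)


def install(package, installed, log):
    run_hook(package, "pre-install", log)
    log.append(f"{package.name}:unpack")
    run_hook(package, "post-install", log)
    fire_triggers(package, installed, log)
    installed.append(package)


def remove(package, installed, log):
    run_hook(package, "pre-remove", log)
    installed.remove(package)
    log.append(f"{package.name}:delete-files")
    run_hook(package, "post-remove", log)
    fire_triggers(package, installed, log)
```

The point of the trigger step is that the program being run belongs to a different package than the one being installed or removed, for instance a documentation indexer that wants to know whenever a manual is added.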

Scalability.

How well the package format is able to grow to meet future needs. This is of great importance. Many of the comparisons above have little value in the face of this section, because new package programs, new metadata fields, etc can all be added to a scalable package format with little difficulty.

no hard-coded limits
Are there no limits hard-coded into the package format, that might prevent it from expanding to meet future needs? For example, are package names or versions of unlimited size?
new metadata
Can new information (text, binary data, whatever) be added to the metadata easily, without changing the package format?
new section
Can whole new sections be added to the packages, without changing the package format? For example, could the package format be expanded to have a pgp signature attached at the end, or to have a second set of data files, compiled for a different architecture or with different optimizations, attached at the end? This is the ultimate test of how flexible the format is. I'm basically asking: was it designed to cope with unforeseen new requirements?
format version data
Is there some way to look at a package and tell which version of the package format it is using? In extreme cases, this means the whole package format can be thrown out and redesigned, but old tools will still be able to read enough of the packages to know they can't deal with them.
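As an illustration of format version data, the Python sketch below reads the version from a deb, assuming the usual layout in which the package is an ar archive whose first member is named debian-binary and holds a short version string such as 2.0. The function names and the "refuse anything with a different major number" rule are my own choices for the example, not a statement of what any particular tool does.

```python
AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60  # name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end(2)


def deb_format_version(data):
    """Return the format version stored in the first ar member of a deb."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    start = len(AR_MAGIC)
    header = data[start:start + AR_HEADER_SIZE]
    if len(header) < AR_HEADER_SIZE or header[58:60] != b"`\n":
        raise ValueError("truncated or corrupt ar member header")
    # Member names are space padded; some ar variants also append a slash.
    name = header[0:16].decode("ascii").rstrip(" /")
    if name != "debian-binary":
        raise ValueError("first member is not debian-binary")
    size = int(header[48:58].decode("ascii").strip())
    body_start = start + AR_HEADER_SIZE
    return data[body_start:body_start + size].decode("ascii").strip()


def can_handle(version, supported_major=2):
    """Hypothetical policy: accept only packages with a known major version."""
    return int(version.split(".")[0]) == supported_major
```

A tool written this way only needs the outermost layer of the format to stay stable; everything after the version check could be redesigned, and an old tool would still fail cleanly instead of misreading the package.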

Todo.

  • relocatable packages
  • support for arch name in metadata, arch indep packages
  • multiple version of the same package can be installed simultaneously (is this really a package format issue?)
  • info available to package programs -- The programs may find various information useful to make decisions while they are running. Of course, all of them can look at what's currently on the filesystem, run other programs and look at the output, etc. This lists other information that may be useful. (old package version, etc)

Footnotes.

1. Not yet widely used though.
2. Why standard linux tools, not unix tools in general? It's been pointed out that, e.g., gzip is not at all standard on all the unix systems out there.
3. Rpm2cpio does part of the job, but it cannot extract metadata. It's not really a standard tool either.
4. Assuming that bunzip2 is a standard linux tool, or that the package uses gzip compression instead. Also, while it's easy to get at the package contents, not so for the metadata.
5. Most repositories use a specific "datastream" format, while some others simply use tarballs. In the case of tarballs, yes.
6. Rpm may depend on a list of packages, but boolean OR is not supported. However, boolean OR can be emulated using virtual packages and provides. This isn't quite as good, since it does require more coordination between packagers, but I'll let it slide and say "yes".
7. Some people consider file dependencies a gross misfeature.
8. Copyright info is included in Debian packages, but not in an easily extractable format.
9. Fields exist, but there is no standard way to use them.
10. Supported by a version of this package format used at one time by SuSE Linux.
11. Technically, the rpm "lead" contains hard-coded limits on the package name, but the lead is no longer really used by anything except file.
12. To be useful, you need to get a tag number assigned to your new piece of metadata, which implies modifying the rpm program.


Copyright 1998-2000 by Joey Hess under the terms of the GNU GPL, either version 2 or at your option, any later version.

