Bug#397121: unnecessary dpkg accesses to the available file

View: New views
2 Messages — Rating Filter:   Alert me  

Bug#397121: unnecessary dpkg accesses to the available file

by Michel Lespinasse :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Package: dpkg
Version: 1.13.24
Severity: normal

As reported under bug 395140, dpkg/dselect takes a lot of memory and
can easily push low-memory systems to swap. Most of this memory usage
is related to parsing the available file into an in-memory database.

I mentionned this issue to Matt Zimmerman and he made the following
observation:
> The fact that dpkg pays attention to the available file at all is a bug;
> it should only care about the state of the system and not about external
> repositories.  Only higher level package managers like apt and dselect
> should do that.

I think he's correct, in that dpkg only needs to consider dependencies or
conflicts with the installed packages on the system, and can safely ignore
anything that has not been installed. Any other behaviour is at best
undocumented, and most likely, a bug.

In detail, here is what I suggest:

dpkg --install / --unpack / --configure / --remove / --purge currently read
and write the available file, they need not touch it at all.

dpkg --get-selections / --set-selections / --clear-selections / --audit /
 --yet-to-unpack should not need to touch the available file either, I think.

dpkg-query should never read or write the available file except for the
--print-avail command.


The commands that write to the available file should be limited to:
dpkg --record-avail
dpkg --update-avail / --merge-avail / --clear-avail
(any writes to the available file will be lost at the next dselect update
anyway, but dpkg --update-avail needs to work for dselect update to work).

The commands that read the available file should be limited to:
dselect select (reads and rewrites today)
dpkg-query --print-avail (already done readonly today)
dpkg --forget-old-unavail (reads and rewrites today)
dpkg --predep-package (already done readonly today)


The attached patch implements this suggestion.

--
Michel "Walken" Lespinasse
"Bill Gates is a monocle and a Persian cat away from being the villain
in a James Bond movie." -- Dennis Miller

diff -ru dpkg-1.13.24.patch123/dselect/main.cc dpkg-1.13.24/dselect/main.cc
--- dpkg-1.13.24.patch123/dselect/main.cc 2006-10-28 00:53:05.000000000 -0700
+++ dpkg-1.13.24/dselect/main.cc 2006-11-05 02:27:31.000000000 -0800
@@ -348,7 +348,8 @@
 }
 
 urqresult urq_list(void) {
-  readwrite= modstatdb_init(admindir,msdbrw_writeifposs);
+  readwrite= modstatdb_init(admindir,(modstatdb_rw)(msdbrw_writeifposs|
+                                                    msdbrw_readonlyavail));
 
   curseson();
 
diff -ru dpkg-1.13.24.patch123/lib/dbmodify.c dpkg-1.13.24/lib/dbmodify.c
--- dpkg-1.13.24.patch123/lib/dbmodify.c 2006-10-28 00:53:20.000000000 -0700
+++ dpkg-1.13.24/lib/dbmodify.c 2006-11-04 23:01:11.000000000 -0800
@@ -177,9 +177,9 @@
   if (cstatus != msdbrw_needsuperuserlockonly) {
     cleanupdates();
     if(!(cflags & msdbrw_noavail))
-    parsedb(availablefile,
-            pdb_recordavailable|pdb_rejectstatus,
-            NULL,NULL,NULL);
+      parsedb(availablefile,
+              pdb_recordavailable|pdb_rejectstatus,
+              NULL,NULL,NULL);
   }
 
   if (cstatus >= msdbrw_write) {
@@ -212,7 +212,8 @@
   switch (cstatus) {
   case msdbrw_write:
     checkpoint();
-    writedb(availablefile,1,0);
+    if(!(cflags & (msdbrw_noavail | msdbrw_readonlyavail)))
+      writedb(availablefile,1,0);
     /* tidy up a bit, but don't worry too much about failure */
     fclose(importanttmp);
     strcpy(updatefnrest, IMPORTANTTMP); unlink(updatefnbuf);
diff -ru dpkg-1.13.24.patch123/lib/dpkg-db.h dpkg-1.13.24/lib/dpkg-db.h
--- dpkg-1.13.24.patch123/lib/dpkg-db.h 2006-10-28 00:54:46.000000000 -0700
+++ dpkg-1.13.24/lib/dpkg-db.h 2006-11-05 02:26:58.000000000 -0800
@@ -160,6 +160,7 @@
   msdbrw_flagsmask= ~077,
   /* flags start at 0100 */
   msdbrw_noavail= 0100,
+  msdbrw_readonlyavail= 0200,
 };
 
 struct pipef {
diff -ru dpkg-1.13.24.patch123/src/archives.c dpkg-1.13.24/src/archives.c
--- dpkg-1.13.24.patch123/src/archives.c 2006-10-28 00:53:31.000000000 -0700
+++ dpkg-1.13.24/src/archives.c 2006-11-01 02:02:05.000000000 -0800
@@ -945,10 +945,10 @@
   char *p;
 
   modstatdb_init(admindir,
-                 f_noact ?                     msdbrw_readonly
+                 f_noact ?                     (msdbrw_readonly | msdbrw_noavail)
                : cipaction->arg == act_avail ? msdbrw_write
-               : fc_nonroot ?                  msdbrw_write
-               :                               msdbrw_needsuperuser);
+               : fc_nonroot ?                  (msdbrw_write | msdbrw_noavail)
+               :                               (msdbrw_needsuperuser | msdbrw_noavail));
 
   checkpath();
   
diff -ru dpkg-1.13.24.patch123/src/enquiry.c dpkg-1.13.24/src/enquiry.c
--- dpkg-1.13.24.patch123/src/enquiry.c 2006-10-28 00:53:31.000000000 -0700
+++ dpkg-1.13.24/src/enquiry.c 2006-11-05 01:40:34.000000000 -0800
@@ -120,7 +120,7 @@
 
   if (*argv) badusage(_("--audit does not take any arguments"));
 
-  modstatdb_init(admindir,msdbrw_readonly);
+  modstatdb_init(admindir,msdbrw_readonly|msdbrw_noavail);
 
   for (bsi= badstatinfos; bsi->yesno; bsi++) {
     head= 0;
diff -ru dpkg-1.13.24.patch123/src/packages.c dpkg-1.13.24/src/packages.c
--- dpkg-1.13.24.patch123/src/packages.c 2006-10-28 00:53:32.000000000 -0700
+++ dpkg-1.13.24/src/packages.c 2006-11-01 02:07:00.000000000 -0800
@@ -68,9 +68,9 @@
   size_t l;
   
   modstatdb_init(admindir,
-                 f_noact ?    msdbrw_readonly
-               : fc_nonroot ? msdbrw_write
-               :              msdbrw_needsuperuser);
+                 f_noact ?    (msdbrw_readonly | msdbrw_noavail)
+               : fc_nonroot ? (msdbrw_write | msdbrw_noavail)
+               :              (msdbrw_needsuperuser | msdbrw_noavail));
   checkpath();
 
   if (f_pending) {
diff -ru dpkg-1.13.24.patch123/src/query.c dpkg-1.13.24/src/query.c
--- dpkg-1.13.24.patch123/src/query.c 2006-10-28 00:53:31.000000000 -0700
+++ dpkg-1.13.24/src/query.c 2006-11-05 01:20:26.000000000 -0800
@@ -187,7 +187,7 @@
   const char *thisarg;
   int np, nf, i, head, found;
 
-  modstatdb_init(admindir,msdbrw_readonly);
+  modstatdb_init(admindir,msdbrw_readonly | msdbrw_noavail);
 
   np= countpackages();
   pkgl= m_malloc(sizeof(struct pkginfo*)*np);
@@ -321,7 +321,7 @@
     badusage(_("--%s needs at least one package name argument"), cipaction->olong);
 
   failures= 0;
-  if (cipaction->arg==act_listfiles)
+  if (cipaction->arg!=act_printavail)
     modstatdb_init(admindir,msdbrw_readonly|msdbrw_noavail);
   else
     modstatdb_init(admindir,msdbrw_readonly);
@@ -421,7 +421,7 @@
     return;
   }
 
-  modstatdb_init(admindir,msdbrw_readonly);
+  modstatdb_init(admindir,msdbrw_readonly|msdbrw_noavail);
 
   np= countpackages();
   pkgl= m_malloc(sizeof(struct pkginfo*)*np);
diff -ru dpkg-1.13.24.patch123/src/select.c dpkg-1.13.24/src/select.c
--- dpkg-1.13.24.patch123/src/select.c 2006-10-28 00:53:31.000000000 -0700
+++ dpkg-1.13.24/src/select.c 2006-11-01 02:20:42.000000000 -0800
@@ -47,7 +47,7 @@
   const char *thisarg;
   int np, i, head, found;
 
-  modstatdb_init(admindir,msdbrw_readonly);
+  modstatdb_init(admindir,msdbrw_readonly|msdbrw_noavail);
 
   np= countpackages();
   pkgl= m_malloc(sizeof(struct pkginfo*)*np);
@@ -94,7 +94,7 @@
 
   if (*argv) badusage(_("--set-selections does not take any argument"));
 
-  modstatdb_init(admindir,msdbrw_write);
+  modstatdb_init(admindir,msdbrw_write|msdbrw_noavail);
 
   varbufinit(&namevb);
   varbufinit(&selvb);
@@ -153,7 +153,7 @@
   if (*argv)
     badusage(_("--clear-selections does not take any argument"));
 
-  modstatdb_init(admindir, msdbrw_write);
+  modstatdb_init(admindir, msdbrw_write|msdbrw_noavail);
 
   it = iterpkgstart();
   while ((pkg = iterpkgnext(it))) {
diff -ru dpkg-1.13.24.patch123/src/update.c dpkg-1.13.24/src/update.c
--- dpkg-1.13.24.patch123/src/update.c 2006-10-28 00:53:31.000000000 -0700
+++ dpkg-1.13.24/src/update.c 2006-11-05 01:37:29.000000000 -0800
@@ -100,8 +100,8 @@
 
   if (*argv) badusage(_("--forget-old-unavail takes no arguments"));
 
-  modstatdb_init(admindir, f_noact ? msdbrw_readonly : msdbrw_write);
-
+  modstatdb_init(admindir, f_noact ? msdbrw_readonly :
+                                     (msdbrw_write|msdbrw_readonlyavail));
   it= iterpkgstart();
   while ((pkg= iterpkgnext(it))) {
     debug(dbg_eachfile,"forgetold checking %s",pkg->name);

Bug#397121: unnecessary dpkg accesses to the available file

by Guillem Jover :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

On Sun, 2006-11-05 at 02:55:58 -0800, Michel Lespinasse wrote:
> Package: dpkg
> Version: 1.13.24
> Severity: normal

> As reported under bug 395140, dpkg/dselect takes a lot of memory and
> can easily push low-memory systems to swap. Most of this memory usage
> is related to parsing the available file into an in-memory database.
>
> I mentionned this issue to Matt Zimmerman and he made the following
> observation:
> > The fact that dpkg pays attention to the available file at all is a bug;
> > it should only care about the state of the system and not about external
> > repositories.  Only higher level package managers like apt and dselect
> > should do that.
>
> I think he's correct, in that dpkg only needs to consider dependencies or
> conflicts with the installed packages on the system, and can safely ignore
> anything that has not been installed. Any other behaviour is at best
> undocumented, and most likely, a bug.

The only problem I see with this is that then dpkg will ignore
completely any override from the archive, which most of the time is
not really important as they affect stuff not used by dpkg for the
dependency resolution as you said.

The only important information might be the Section and Priority fields,
which are usually overriden and used to create roostraps or select what's
the base system, etc. But on the other hand I think all programs doing
that are using the archive Packages files for that purpose, so I guess
it's fine to apply this patch.

> In detail, here is what I suggest:
>
> dpkg --install / --unpack / --configure / --remove / --purge currently read
> and write the available file, they need not touch it at all.
>
> dpkg --get-selections / --set-selections / --clear-selections / --audit /
>  --yet-to-unpack should not need to touch the available file either, I think.
>
> dpkg-query should never read or write the available file except for the
> --print-avail command.
>
>
> The commands that write to the available file should be limited to:
> dpkg --record-avail
> dpkg --update-avail / --merge-avail / --clear-avail
> (any writes to the available file will be lost at the next dselect update
> anyway, but dpkg --update-avail needs to work for dselect update to work).
>
> The commands that read the available file should be limited to:
> dselect select (reads and rewrites today)
> dpkg-query --print-avail (already done readonly today)
> dpkg --forget-old-unavail (reads and rewrites today)
> dpkg --predep-package (already done readonly today)

Yeah seems reasonable, will review and merge.

thanks,
guillem




--
To UNSUBSCRIBE, email to debian-dpkg-bugs-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...