roxen.lists.roxen.general

Subject: RE: [PATCH 14/17] New module: gzip-on-the-fly
Author: Arjan van Staalduijnen <Arjan[dot]van[dot]Staalduijnen[at]rtl[dot]nl>
Date: 20-01-2009

I worked on a similar module some time ago, but never properly finished it.

I have not tested your code, but I notice a few things missing that I ran into
in my own code.

For proper protocol cache support it should have the following:

#if ROXEN_MAJOR_VERSION > 4
        id->register_vary_callback("accept-encoding", accept_encoding_callback);
#else
        id->register_vary_callback("Accept-Encoding", accept_encoding_callback);
#endif

...early in the code. That way the module can serve both kinds of browsers
(gzip-supporting and non-supporting) even while protocol caching is enabled.
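
To illustrate why the cache has to vary on Accept-Encoding, here is a rough
sketch of the idea in Python rather than Pike, so it runs standalone. It is a
toy cache, not Roxen's protocol cache; accepts_gzip, cache_key and serve are
hypothetical names, and q-values in the header are ignored for brevity.

```python
# Toy illustration only: none of this is Roxen API.

def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value lists gzip or x-gzip.

    q-values (e.g. "gzip;q=0") are deliberately ignored in this sketch.
    """
    if not accept_encoding:
        return False
    for item in accept_encoding.split(","):
        token = item.split(";", 1)[0].strip().lower()
        if token in ("gzip", "x-gzip"):
            return True
    return False


def cache_key(path, request_headers):
    """Build a cache key that differs per encoding variant of the same path."""
    gzip_ok = accepts_gzip(request_headers.get("accept-encoding"))
    return (path, "gzip" if gzip_ok else "identity")


cache = {}


def serve(path, request_headers, render):
    """Return the cached body for this path and variant, rendering on a miss.

    `render` is a caller-supplied function taking the variant name.
    """
    key = cache_key(path, request_headers)
    if key not in cache:
        cache[key] = render(key[1])
    return cache[key]
```

Without the variant in the key, whichever response was cached first (compressed
or not) would be handed to every later client, whether it can decode it or not.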


The result mapping passed into filter() should also be checked for
"transfer-encoding" and "content-encoding" keys in the result->headers mapping
(in any mix of upper and lower case). In my experience, gzip compression caused
problems when either was present. For that purpose you'll also need to register
these:


#if ROXEN_MAJOR_VERSION > 4
       id->register_vary_callback("transfer-encoding", transfer_encoding_callback);
       id->register_vary_callback("content-encoding", content_encoding_callback);
#else
       id->register_vary_callback("Transfer-Encoding", transfer_encoding_callback);
       id->register_vary_callback("Content-Encoding", content_encoding_callback);
#endif
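
The case-insensitive header check described above could look roughly like
this. It is a Python sketch of the logic only, not Pike and not Roxen API;
already_encoded is a hypothetical name, and the result is assumed to be a plain
dictionary with an optional "headers" dictionary.

```python
# Header names that indicate the response body is already encoded in some way.
BLOCKING_HEADERS = {"transfer-encoding", "content-encoding"}


def already_encoded(result):
    """Return True if the result carries one of the blocking headers,
    regardless of how the header name is capitalised."""
    headers = result.get("headers") or {}
    return any(name.lower() in BLOCKING_HEADERS for name in headers)
```

A filter would skip compression whenever this returns True.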


Also pay attention to 'Accept-Encoding: gzip' vs. 'Accept-Encoding: x-gzip'. I
don't know how relevant that difference still is, but years ago there were
issues with clients that sent x-gzip. They required x-gzip to be returned as
the content encoding, and would fail otherwise.
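
One way to handle that would be to echo back whichever token the client
offered. The following is a Python sketch of that selection only, not Pike and
not part of the patch; response_encoding is a hypothetical name, and preferring
"gzip" when both tokens are offered is an assumption.

```python
def response_encoding(accept_encoding):
    """Pick the Content-Encoding token to send back, or None for no gzip.

    Clients that only offer "x-gzip" get "x-gzip" back; q-values are ignored.
    """
    if not accept_encoding:
        return None
    tokens = [item.split(";", 1)[0].strip().lower()
              for item in accept_encoding.split(",")]
    if "gzip" in tokens:
        return "gzip"
    if "x-gzip" in tokens:
        return "x-gzip"
    return None
```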


Regards,


Arjan

-----Original message-----
From: Stephen R. van den Berg [mailto:<srb[at]cuci.nl>]
Sent: Tuesday, January 20, 2009 3:32 PM
To: <roxen[at]roxen.com>
Subject: [PATCH 14/17] New module: gzip-on-the-fly


---

 server/modules/filters/gziponthefly.pike |  102 ++++++++++++++++++++++++++++++
 1 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 server/modules/filters/gziponthefly.pike

diff --git a/server/modules/filters/gziponthefly.pike b/server/modules/filters/gziponthefly.pike
new file mode 100644
index 0000000..dae088e
--- /dev/null
+++ b/server/modules/filters/gziponthefly.pike
@@ -0,0 +1,102 @@
+// This is a roxen module which provides gzip-on-the-fly compression support.
+// Copyright (c) 2002-2009, Stephen R. van den Berg, The Netherlands.
+//                     <<srb[at]cuci.nl>>
+//
+// This module is open source software; you can redistribute it and/or
+// modify it under the terms of the GNU General Public License as published
+// by the Free Software Foundation; either version 2, or (at your option) any
+// later version.
+//
+
+#define _(X,Y)  _DEF_LOCALE("mod_gziponthefly",X,Y)
+
+constant thread_safe = 1;
+
+#include <module.h>
+
+inherit "module";
+
+constant module_type = MODULE_FILTER;
+LocaleString module_name = _(1,"Filters: Gzip-on-the-fly");
+LocaleString module_doc =  _(2,
+ "This module provides the gzip-on-the-fly filter.<br />"
+ "<p>Copyright &copy; 2002-2009, by "
+ "<a href='mailto:<srb[at]cuci.nl>'>Stephen R. van den Berg</a>, "
+ "The Netherlands.</p>"
+ "<p>Due to implementation mistakes by Microsoft this filter "
+ "uses the gzip format instead of deflate. </p> "
+ "<p>By setting a browser-cookie of gzip=0 one can disable compression, "
+ "whereas setting gzip=1 will force compression. </p> "
+ "<p>This module is open source software; you can redistribute it and/or "
+ "modify it under the terms of the GNU General Public License as published "
+ "by the Free Software Foundation; either version 2, or (at your option) any "
+ "later version.</p>");
+
+#define COMPRESSION_LEVEL  1
+
+void create() {
+  set_module_creator("Stephen R. van den Berg <<srb[at]cuci.nl>>");
+  defvar ("compressionlevel", 1,
+        _(3,"Compressionlevel"), TYPE_INT,
+        _(4,"Use 1 for fast compression, use 9 for slow compression.")
+          );
+  defvar ("minfilesize", 1024,
+        _(5,"Minimum file size"), TYPE_INT,
+        _(6,"Any data equal to or below this size limit will be sent "
+          "uncompressed.")
+          );
+}
+
+int before, after, nosupport, fixed, nowant;
+float time_spent;
+
+string status()
+{
+    return sprintf("%.1fM of %.1fM (%.1f%%) have been gained "
+		   "by using %.2f CPU seconds."
+		   "<br/>%d+%d/%d request%s were sent uncompressed "
+		   "due to lack of gzip support", 
+		   (before-after)/1024.0/1024.0, before/1024.0/1024.0,
+		   (before-after)*100 / (before+0.1), time_spent,
+		   nowant,nosupport, fixed, ((nowant+nosupport)==1?"":"s"));
+}
+
+mapping filter( mapping result, RequestID id )
+{
+  int len;
+  if(  id->misc->internal_get
+    || id->misc->gzippedalready
+    || !result
+    || !stringp(result->data)
+    || String.width(result->data)>8
+    || (len=sizeof(result->data))<=query("minfilesize")
+    || result->type
+     && !has_prefix(result->type, "text/")
+     && !has_prefix(result->type, "application/x-javascript"))
+    return 0;
+
+  id->misc->gzippedalready = 1;
+  before+=len;
+  fixed++;
+  if( (id->cookies->gzip != "1"
+      && (!id->request_headers["accept-encoding"] || 
+          !has_value(id->request_headers["accept-encoding"], "gzip" )))
+      || id->cookies->gzip=="0")
+  {
+    if( id->cookies->gzip=="0" )
+      nowant++;
+    else
+      nosupport++;
+    return 0;
+  }
+  time_spent+=gauge {
+    StringFile out=StringFile();
+    Gz._file c = Gz._file(out,"ab");
+    c->setparams(query("compressionlevel"),Gz.DEFAULT_STRATEGY);
+    c->write(result->data);
+    c->close();
+    result->data = out->outdata*"";   // Transform back to single string
+  };
+  after+=sizeof(result->data);
+  result->encoding="gzip";
+}


