I'm trying Beta8, and "?sid=Echo.Boot", "?sid=Echo.ContentPane", "?sid=Echo.Label" all have a "Cache-Control: no-store" header. Is this a known/temporary oversight?
BTW, lines
response.setHeader("Cache-Control", "max-age=3600");
response.setDateHeader("Expires", System.currentTimeMillis() + (86400000));
in WebContainerServlet produce conflicting expiration data: max-age=3600 says one hour, while the Expires header is set 24 hours (86400000 ms) ahead.
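One way to keep the two headers from drifting apart is to derive both from a single lifetime value. The following is only a sketch: `CacheHeaders` and its methods are hypothetical names, not part of Echo, and the results would still have to be passed to `response.setHeader` and `response.setDateHeader` as in the lines quoted above.

```java
/** Hypothetical helper (not an Echo class): derives both headers from one lifetime. */
final class CacheHeaders {
    private CacheHeaders() {}

    /** Value for the Cache-Control header, e.g. "max-age=3600". */
    static String cacheControl(long lifetimeSeconds) {
        return "max-age=" + lifetimeSeconds;
    }

    /** Value for the Expires date header, in milliseconds since the epoch. */
    static long expiresAt(long nowMillis, long lifetimeSeconds) {
        // Same lifetime as max-age, converted from seconds to milliseconds.
        return nowMillis + lifetimeSeconds * 1000L;
    }
}
```

With a lifetime of 3600 seconds, both headers then describe the same one-hour window.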
This could be done better, but bear in mind these files generally will be loaded one time per application session, and tend to be in the 200k range (and gzipped for non-IE browsers). The reason for its existence is to avoid executing out-of-date JavaScript after a new app is deployed to a server.
It's been a long time since I looked at this, but if I recall correctly, IE was the problem (surprise) and it tends to get it wrong without fairly draconian measures like no-store. I seem to remember that "doing it right" meant that it was nothing short of impossible to test new JS code with Internet Explorer without a clear cache/exit/restart operation. That said, it might be better to only invoke no-store if a JVM property is set (but that still might cause major headaches for people when they redeploy their apps with upgraded Echo versions/components).
Probably the best solution is to 1.) do it right as you've suggested and 2.) add a "server instance id" to all the service URLs (a unique id generated when the server starts) to ensure that updated scripts are always loaded.
Tracker: http://bugs.nextapp.com/mantis/view.php?id=189
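The "server instance id" idea could be sketched roughly as below. Everything here is an assumption for illustration: the class name, the `iid` parameter name, and the use of the class-load time as the id are not taken from Echo.

```java
/** Hypothetical sketch: tags service URLs with an id that changes on every server start. */
final class ServerInstanceId {
    // Computed once when this class is first loaded, i.e. once per server start.
    private static final String ID = Long.toString(System.currentTimeMillis(), 36);

    private ServerInstanceId() {}

    static String id() {
        return ID;
    }

    /** Appends the instance id as an extra query parameter ("iid" is an invented name). */
    static String appendTo(String serviceUri) {
        String separator = serviceUri.contains("?") ? "&" : "?";
        return serviceUri + separator + "iid=" + ID;
    }
}
```

Because the id changes after a restart, the browser sees new URLs and refetches the scripts, even when they have not changed.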
This could be done better, but bear in mind these files generally will be loaded one time per application session,
That means every time the user hits the URL.
and tend to be in the 200k range (and gzipped for non-IE browsers).
Even if the files are small, caching reduces page loading time, because otherwise the browser has to make a TCP/IP round trip to fetch those parts.
I'm planning on a page which will be re-visited fairly often (todo list), and fast loading is an important issue.
http://bugs.nextapp.com/mantis/view.php?id=189
mentions having a build version after "?", to make the URLs unique,
but I wonder if the size of the file would be enough for that:
usually I'd put the File.lastModified() there, but I'm not sure if you have it
around, maybe you're loading those files via a ClassLoader.
File size might be sloppy, but it would allow for changing JS libs at runtime,
which might be useful for you when debugging the JS.
Also, a static variable with a timestamp might be used instead of a build version, which might be easier to implement than having a build version.
P.S. In our CMS we use Cache-Control: must-revalidate and a cached Etag and it seems to work for IE, but for Echo3 checksumming might be overkill.
P.P.S. Thinking some more on this, the best solution seems to be an instance variable in the WebContainerServlet (or in the ApplicationInstance, or some other instance), initialized by default to the current time, but open to modification from the program. That way the program can either prolong the cacheability of JS libraries, or force a library refresh, which might be useful for Echo3 developers in order to tweak the library at runtime. Instance variables are just a better way to go in a multi-app environment, where every app might have different needs.
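A minimal sketch of such an instance variable follows. The class and method names are hypothetical; in practice the field would live in WebContainerServlet or a similar per-application object rather than in a standalone class.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-application holder for the version used in cache decisions. */
class CacheVersionHolder {
    // Defaults to the creation time, so every new instance starts with a fresh version.
    private final AtomicLong cacheVersion = new AtomicLong(System.currentTimeMillis());

    long getCacheVersion() {
        return cacheVersion.get();
    }

    /** Lets the program pin a known version to prolong cacheability. */
    void setCacheVersion(long version) {
        cacheVersion.set(version);
    }

    /** Forces clients to refetch by moving to a version not used before by this holder. */
    void forceRefresh() {
        cacheVersion.incrementAndGet();
    }
}
```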
workaround
Here is a workaround which works for "Echo.Boot" (in Scala):
    import nextapp.echo.webcontainer.{WebContainerServlet, Connection}
    import nextapp.echo.webcontainer.service.JavaScriptService

    val ECHO3SERVLET = new WebContainerServlet {
      {
        val services = WebContainerServlet.getServiceRegistry
        val js = services.get ("Echo.Boot")
        services.remove (js)
        services.add (new JavaScriptService (js.getId, "") {
          override def getId = js.getId
          override def getVersion = 12345 // Allows browser caching.
          override def service (conn: Connection) = js service conn
        })
      }
      override def newApplicationInstance = new ProjectApp
    }

I guess to fix this issue one has to modify JavaScriptService.forResource, the JavaScriptService constructor and WebContainerServlet.process. One might enable caching per WebContainerServlet, allowing developers to tweak and test the library with caching disabled, while users would have the benefits of caching JS libs in the browser. (No URL modification is necessary.) Etag can be used (from hashCode and length of the JS resource) to reuse the browser cache even after the expiration time.
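To illustrate the Etag suggestion, here is a rough sketch of building a tag from the hashCode and length of the script text and answering a conditional request. The class and method names are invented for the example, and the servlet plumbing (reading If-None-Match, writing the status) is left out.

```java
/** Hypothetical sketch of ETag handling for a JavaScript resource. */
final class ScriptEtag {
    private ScriptEtag() {}

    /** Builds a quoted ETag from the content's hashCode and length. */
    static String etagFor(String content) {
        return "\"" + Integer.toHexString(content.hashCode()) + "-" + content.length() + "\"";
    }

    /**
     * Returns the HTTP status to send: 304 if the browser's If-None-Match value
     * matches the current content, otherwise 200 (send the full body).
     */
    static int statusFor(String ifNoneMatch, String content) {
        return etagFor(content).equals(ifNoneMatch) ? 304 : 200;
    }
}
```

A 304 response still costs a round trip, but it avoids retransmitting the script once the expiration time has passed.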
I could come up with a patch if needed.
P.S. The absolute minimum is to move the caching decisions and handling into a separate protected method in WebContainerServlet, allowing the developers to tune it.
I'd like to be able to write
    if (service isInstanceOf JavaScriptService) cache (86400); else noCache ();

Use a hash of the resources
This is now becoming a major issue for us also.
A colleague of mine has suggested putting the hash of the content into the service URL for JavaScript services.
This seems to make most sense to me. If the resource has not changed, it will remain cached.
Using a server instance id that is different each time the context deploys is not optimal, as resources that have not changed will be reloaded.
What do you think, Tod? I am happy to make the patch.
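As a rough illustration of the content-hash idea, the sketch below puts an MD5 hex digest of the script into the URL. The class name and the `v` query parameter are assumptions made for the example, not Echo's actual URL scheme.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: service URLs that change only when the content changes. */
final class ContentHashUri {
    private ContentHashUri() {}

    /** Returns the MD5 digest of the content as 32 lowercase hex characters. */
    static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a URL such as "app?sid=Echo.Boot&v=<hash>" ("v" is an invented parameter). */
    static String serviceUri(String base, String serviceId, String content) {
        return base + "?sid=" + serviceId + "&v=" + md5Hex(content);
    }
}
```

An unchanged script keeps the same URL across redeployments, so it stays cached, which is the advantage over a server instance id.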
Sounds like a great idea to me, go for it!
This is coming along nicely. I am introducing an interface for services that are able to provide UIDs to represent their version. If the service implements that interface, then the UID is placed into the URL and a very long expiry time (20 years is what Amazon uses with their versioned JS loading) is used, since the browser can effectively cache that resource perpetually. If the content of the resource changes, the UID will change, so the URL will change, so it will be picked up.
I am only implementing this for the JavaScript service for now. It may make sense to implement it in the static text service too, but I haven't thought that through, so I am leaving it alone.
ETA a few days.
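The extension-interface approach described above might look something like this sketch. The interface name, the method name, and the fallback to no-store are all assumptions; the actual patch may differ.

```java
/** Hypothetical extension interface for services that can identify their content version. */
interface VersionUidProvider {
    String getVersionUid();
}

/** Hypothetical policy: very long expiry for versioned services, no caching otherwise. */
final class ExpiryPolicy {
    // 20 years expressed in seconds (ignoring leap days).
    static final long TWENTY_YEARS_SECONDS = 20L * 365 * 24 * 60 * 60;

    private ExpiryPolicy() {}

    static String cacheControlFor(Object service) {
        if (service instanceof VersionUidProvider) {
            // Safe to cache "forever": changed content yields a new UID and thus a new URL.
            return "max-age=" + TWENTY_YEARS_SECONDS;
        }
        return "no-store";
    }
}
```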
Wouldn't it be possible to just use the existing Service.getVersion() for this purpose? You could then modify the UserInstance.getServiceUri() methods to include this version id. WebContainerServlet already contains caching directives. Caching can be disabled by returning the DO_NOT_CACHE constant.
Existing JavaScriptService implementations can then return a different version as needed (most return DO_NOT_CACHE right now). The initial WindowHtmlService will use DO_NOT_CACHE so that the client is always forced to refetch the initial application HTML.
Does this make sense or am I missing something?
Niels
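A sketch of how the version could be folded into the service URI is shown below. This is not the real UserInstance.getServiceUri() code: the class is a stand-in, the `v` parameter is invented, and the value given to DO_NOT_CACHE here is an assumption made only for the example.

```java
/** Hypothetical stand-in for the URI-building logic; not Echo's actual implementation. */
final class VersionedUri {
    // Stand-in for Echo's DO_NOT_CACHE constant; the value -1 is assumed for illustration.
    static final int DO_NOT_CACHE = -1;

    private VersionedUri() {}

    static String serviceUri(String base, String serviceId, int version) {
        String uri = base + "?sid=" + serviceId;
        // Uncacheable services keep the plain URL; cacheable ones carry their version.
        return version == DO_NOT_CACHE ? uri : uri + "&v=" + version;
    }
}
```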
The thought had crossed my mind. You cannot express an MD5 hash as an integer - it is too large. I could change the Service interface so that the version is a long....
That does make far more sense. Let me look into it.
I somehow missed the md5 hash, I thought that the developer was responsible for incrementing the version. Automatic hashing would be much easier indeed.
If we include the version id in the service url, we could also introduce caching for "plain" non-modified JavaScript services (without breaking backwards compatibility). This requires a creative solution for JavaScriptService.getVersion() though that fits in an int. We might get away with just returning the number of characters of the content.
Widening of int to long in the Service interface is not source backwards compatible, unfortunately.
Niels
So which is the right way to go:
It depends how near we are to wanting a final release. If we are near, then the breaking change does not seem like a good idea right now, and I would vote for the extension interface. If we are not near, then I would vote for the widening.
Would appreciate a vote from at least you and Tod.
You don't need to change the interface.
You can return the first 32 bits of the MD5 hash instead.
Alternatively, use CRC32 to calculate the hash.
This returns a long, but only the low 32 bits are significant. It should also be faster than generating an MD5.
-Tim
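Both suggestions fit in an int, as this sketch shows. It uses the standard java.util.zip.CRC32 and java.security.MessageDigest classes; the wrapper class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

/** Hypothetical helpers that squeeze a content hash into a 32-bit version number. */
final class IntVersions {
    private IntVersions() {}

    /** CRC32 of the content; getValue() returns a long whose low 32 bits hold the checksum. */
    static int crc32Version(byte[] content) {
        CRC32 crc = new CRC32();
        crc.update(content);
        return (int) crc.getValue();
    }

    /** First 32 bits (first four bytes, big-endian) of the MD5 digest. */
    static int md5Version(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            return ByteBuffer.wrap(digest).getInt();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

Either result can be negative, so a real implementation would probably need to make sure it never coincides with whatever value DO_NOT_CACHE uses.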
If we were to use hashing for the cache mechanism, I would prefer it to be 100% accurate. Checksums such as CRC are designed for detecting transmission errors, not intentional changes. It seems that MD5 hashes are more suitable for the job.
If the extension interface would introduce caching by itself, I would go for that solution. But, caching has already been implemented through the use of getVersion(). So besides complexity, it would also introduce possibly conflicting instructions.
Given we have been in beta mode for about a year now (an RC still has not been released), I vote for making the breaking change. This only affects people who have implemented custom services. Most of the time we implement custom JS modules which are constructed through JavaScriptService, and these do not break.
Niels
I plan to implement this at the weekend. Tod, I would appreciate a steer from you before then.
Am I correct in thinking that a 64-bit hash is "only" 4 billion+ times more likely to be unique than a 32-bit one? Is there anything in an MD5 that makes it less likely that a change will result in a collision? I thought the only point of an MD5 (or other secure hash algorithms, like SHA) was to make it difficult to find a collision.
There's no way for us to programmatically determine a last modified date on any resources (any class file, resource object, etc) I presume?
Could we just use server startup time (or rather time at which some class was first loaded) as the version number? Not sure if this could cause issues with clustering. 31 bits gives us 68 years with one second accuracy. It would force a reload if server was restarted or if you're doing interesting things with classloaders of course.
Sorry for being all questions and not having anything resembling real insight. Would like to avoid API breakage if possible, but also would prefer to not have two APIs. I'd really like to try shoving this information into the existing API w/ 31 bits with reliability if at all possible.
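The 31-bit startup-time idea, and the 68-year figure, can be checked with a small sketch like the one below. The class and method names are hypothetical, and it ignores the clustering question entirely.

```java
/** Hypothetical sketch: a non-negative int version derived from server start time. */
final class StartupVersion {
    // Captured once when this class is first loaded.
    private static final int VERSION = fromMillis(System.currentTimeMillis());

    private StartupVersion() {}

    static int current() {
        return VERSION;
    }

    /** Seconds since the epoch, truncated to 31 bits so the result is never negative. */
    static int fromMillis(long millis) {
        return (int) ((millis / 1000L) & 0x7fffffffL);
    }

    /** How many years 2^31 seconds span, using a 365.25-day year. */
    static double yearsCovered() {
        return Math.pow(2, 31) / (365.25 * 24 * 3600);
    }
}
```

The value wraps around to zero after 2^31 seconds, so uniqueness only holds for versions generated within that window of each other.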
MD5 might indeed be a bit overkill for our needs. But CRC32 is targeted at transmission errors and seems to be unsuitable for our needs.
Last modified cannot be determined from within a classloader unless the backing resource is located on disk (which does not have to be the case).
Since server startup time is not associated in any way with modification of files, this would very likely result in issues with clustering. Server restart in itself is not really a problem (for me at least), we want to calculate the checksums only once. That said, a library is likely to stay the same across many server restarts, so clients refetching those libraries is a bit wasteful.
My thought on CRC32 vs. MD5 is that the first has a 1 in 2^32 chance of a collision, the second 1 in 2^64 if truncated to a 64-bit long (the full MD5 is 128 bits). The only other feature of MD5 is security, which provides no benefit if we are simply trying to determine a version.
In the case of each, you're not guaranteed that it's unique, it's just pretty darn unlikely. :D With a server-start date you're guaranteed the uniqueness, if the expiration is less than 2 billion seconds (68 years) away. A month should be plenty.
Figured as much on last modified from classloader. I don't even know if there's a way to find out where the backing resource *is*.
Any way we could share a start time across a cluster?
I would be happy to go with the CRC32 check. Yes there is the possibility of collision, but it is still small.
I don't like the idea of the server start-up check, for two reasons:
1) It would require cluster synchronisation of some kind, which is added complexity
2) We make frequent deliveries of the software, i.e. frequent server restarts, and that would mean frequent re-loading of some hefty javascripts.
+1
I vote +1 on both points. No unnecessary cache refreshes and no backwards incompatibilities.
Niels
I've asked around people I respect, who all concur with using CRC32.
I will implement this soon so long as no one objects.
Sounds great to me. I'd say add a system property as well to force non-caching in case anyone wants the old behavior, but make CRC32 the default. I don't think anyone would ever actually use it (the old way) though.
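A system-property switch of that kind could be as small as the sketch below. The property name `echo.javascript.nocache` and the class name are made up for the example; the real name would be whatever the patch settles on.

```java
/** Hypothetical switch: CRC32-based caching by default, old no-store behavior on request. */
final class CachingSwitch {
    // Invented property name, for illustration only.
    static final String PROPERTY = "echo.javascript.nocache";

    private CachingSwitch() {}

    /** Caching stays on unless the JVM was started with -Decho.javascript.nocache=true. */
    static boolean cachingEnabled() {
        return !Boolean.getBoolean(PROPERTY);
    }
}
```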