Tuesday, November 4, 2008

Understanding RDC Page Turn times and caching

At this year's OCUG, I heard a few discussions about RDC page turn times and how opening the "first page" takes a bit longer than opening subsequent pages. While it is true that the page opening time profile varies (based on the state of the cache, among several other factors), strictly speaking, the generic notion that opening the 'first page' takes longer is not technically correct, and I feel the term 'first page' is used ambiguously in these discussions.

Also at OCUG, Hauke Kindler gave a good session in which he presented nicely correlated results of page opening times and bandwidth consumption under different conditions. The clarifications that follow should throw more light on the data he presented and help in understanding the RDC performance profile better.

Besides several other techniques, RDC 4.5.3 uses caching at various levels to improve overall performance, and page turn times in particular. Its caching strategy is distributed across several application tiers. On the middle tier, caching is leveraged at various application components such as the Oracle WebCache server (at the very front of the request), the web server and the application server. Typically, configuring the cache on these mid-tier servers is a one-time activity that the RDC system administrators should take care of following product installation.

Client-tier caching in the end user's browser, on the other hand, can have a significant impact on the overall page turn experience, and here is why.

The "very first time" a new user (one who has never logged into the RDC app before) accesses the RDC application, the app downloads a bunch of static files (images, icons, templates, CSS files, etc.) and a compacted AJAX library file. If the browser cache is configured correctly, this is a one-time-only deal and these files will never be downloaded again (unless they have been updated on the server by an upgrade or patch). Depending on the end user's network quality, this one-and-only download may have a noticeable impact on page turn times, but only for users with poor connections. To be very clear: as long as these files remain cached, users don't incur this cost ever again, even if they close the browser, come back later and log into the RDC app again.
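As a rough illustration of what a correctly configured cache relies on, here is a minimal Python sketch of a server-side rule that marks static files as cacheable for a long time while leaving dynamic data uncached. The suffix list, the one-year lifetime and the `cache_headers` function are assumptions made up for this example; they are not RDC's actual configuration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical values for illustration only, not RDC settings.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60
STATIC_SUFFIXES = (".gif", ".png", ".css", ".js", ".html")


def cache_headers(path: str, now: datetime) -> dict:
    """Return HTTP caching headers for a requested path.

    `now` must be a timezone-aware UTC datetime.
    """
    if path.lower().endswith(STATIC_SUFFIXES):
        # Static file: let the browser keep it without asking again.
        expires = now + timedelta(seconds=ONE_YEAR_SECONDS)
        return {
            "Cache-Control": f"private, max-age={ONE_YEAR_SECONDS}",
            "Expires": format_datetime(expires, usegmt=True),
        }
    # Anything else is treated as dynamic data and is never cached.
    return {"Cache-Control": "no-store"}
```

With headers like these, a browser that honors them keeps the static files across sessions, which matches the "first ever time" behavior described above.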

So the notion that images, templates and other static files are downloaded for the "first page" is incorrect; rather, they are downloaded only the "first ever time".

The next level of client cache kicks into action when a user opens a 'new CRF' for the "first time". Let me explain what I mean by 'new CRF' and 'first time' here: the user has never before clicked on that particular CRF 'in any study for any subject for any study-site'. In RDC 4.5.3, the CRF HTML layout is only a template and is a separate object from the data that is displayed inside it. In other words, for a given CRF, the same CRF HTML template file is used repeatedly to fill in and display data for all subjects across all studies. So when a CRF is accessed for the 'first time', its template is downloaded from the server once and cached on the client. Once again, if the browser cache is configured correctly, a user should download a CRF template file only once for each CRF during its lifetime. If a new version of a CRF layout is generated on the server using the layout editor tool, it is treated as a new CRF for caching purposes and is cached afresh on the client.
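To make the "download once per CRF, again only for a new layout version" rule concrete, here is a small Python sketch of a cache keyed by CRF and layout version. The class, the `fake_fetch` stand-in for the network download, and the integer version numbers are all hypothetical; the real caching is done by the browser, not by code like this.

```python
from typing import Callable, Dict, Tuple


class TemplateCache:
    """Toy client-side cache of CRF templates, keyed by CRF and layout version."""

    def __init__(self, fetch: Callable[[str, int], str]) -> None:
        self._fetch = fetch            # hypothetical download function
        self._store: Dict[Tuple[str, int], str] = {}
        self.downloads = 0             # counts real trips to the server

    def get(self, crf_id: str, layout_version: int) -> str:
        key = (crf_id, layout_version)
        if key not in self._store:
            # Not cached yet: download once and remember it.
            self._store[key] = self._fetch(crf_id, layout_version)
            self.downloads += 1
        return self._store[key]


def fake_fetch(crf_id: str, layout_version: int) -> str:
    # Stand-in for the network download of a template file.
    return f"<html><!-- {crf_id} v{layout_version} --></html>"


cache = TemplateCache(fake_fetch)
cache.get("VITALS", 1)   # first ever access: downloaded
cache.get("VITALS", 1)   # another subject or study: served from cache
cache.get("VITALS", 2)   # new layout version: treated as a new CRF
print(cache.downloads)   # prints 2
```

Note that the key contains no subject, study or site: that is what lets one cached template serve every subject across all studies.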


The same behavior applies to the next/prev buttons in the data entry (DE) window. When a user launches a data entry window and clicks the next/prev buttons, the DE window follows the cache behavior described above when opening the next/prev CRFs, and downloads a CRF HTML page only if it is not already cached.

So a CRF template file is downloaded once and cached locally 'the very first time it is opened'. This has nothing to do with the "first page" notion, and it doesn't matter whether the CRF is opened 'first' (by launching the DE window) or 'later' (by clicking the next/prev button in the DE window).

In fact, it is not only the CRF HTML pages: the HTML for all dialog windows that pop up in the data entry window (such as the discrepancy dialog, the approve/verify dialog, the blank tool dialog and scores of others) is cached in the client tier in exactly the same manner (cached on first access).

So, if the user's browser cache is configured correctly (follow Oracle's recommendations), the only thing that gets downloaded from the server 'every time' is 'NOTHING BUT PURE DYNAMIC DATA', which gets merged into and displayed in the corresponding CRF HTML template.
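The split between a cached template and per-request data can be pictured with a short Python sketch. The template markup, the field names and the `render_crf` helper are invented for this illustration and say nothing about how RDC actually represents a CRF.

```python
from html import escape
from string import Template

# Hypothetical, heavily simplified CRF template; cached once on the client.
CRF_TEMPLATE = Template(
    "<tr><td>Subject</td><td>$subject</td></tr>"
    "<tr><td>Weight (kg)</td><td>$weight</td></tr>"
)


def render_crf(template: Template, data: dict) -> str:
    """Merge one subject's dynamic data into the cached template."""
    # Escape values so the data cannot break the surrounding HTML.
    safe = {name: escape(str(value)) for name, value in data.items()}
    return template.substitute(safe)


# Only this small dictionary would travel over the network on each page turn.
page = render_crf(CRF_TEMPLATE, {"subject": "1001", "weight": 72.5})
```

The point of the sketch is the size difference: the template is fetched once, while the per-page payload is just the handful of values that fill it.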

Btw, 4.5.3 browser clients don't need to send a request to the server to check whether a cached file has changed on the server. Even though such checks transfer no file data between the client and server, those pings would still have a severe impact on overall performance, so we avoid them. If a locally cached file is changed on the server by an upgrade or patch, RDC 4.5.3 automatically takes care of pushing the latest file to the client and replacing the cached copy. Individual browser clients don't need to worry about that.
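I am not describing RDC's actual mechanism here, but one common way to get "no revalidation pings, yet updated files still reach the client" is to put a version token in the file's URL, so that a changed file simply has a new URL the browser has never cached. A minimal Python sketch of that general technique, with a hypothetical `versioned_url` helper and made-up paths:

```python
import hashlib


def versioned_url(path: str, content: bytes) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


# Hypothetical file contents before and after a patch.
before = versioned_url("/rdc/js/ajaxlib.js", b"function a() {}")
after = versioned_url("/rdc/js/ajaxlib.js", b"function a() { return 1; }")
unchanged = versioned_url("/rdc/js/ajaxlib.js", b"function a() {}")
```

Here `before` and `unchanged` are identical, so the cached copy keeps being used with no request at all, while `after` differs and triggers exactly one fresh download.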
