Building Performance into Internet Applications

Summary:

Web applications are built on client-server, request-response mechanisms. The performance of any Web application of moderate to high functionality depends directly on the architectural decisions behind it and on the technologies chosen to support them. It also depends on how aware programmers are of the architecture and on their ability to address performance from the inception phase of the project. This paper discusses various ways to enhance the performance of Web-based applications.

High-Level Architecture
At a high level, Web applications are usually divided into four basic layers. Layers 3 and 4 are optional and are chosen based on product requirements:

  1. Presentation layer (client side/user interface)
  2. Distribution layer (server side)
  3. Business logic layer
  4. Back end (database/external dependency)

The general flow of this architecture is as follows: The client (presentation layer) requests a URL. A Web server (distribution layer) receives the request and carries out the preliminary processing. Based on that processing, the Web server calls the business logic layer, which carries out further processing according to its encapsulated business rules. The business logic also interacts with back-end database applications and with any external applications. When processing completes, the business logic returns control to the Web server, and the Web server sends a response to the client.
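
The flow above can be sketched as a chain of function calls. This is only an illustration, assuming a single greeting feature; the names handleRequest, businessLogic, and backEnd are hypothetical and do not come from any particular server product.

```javascript
// Hypothetical four-layer request flow; every name here is illustrative.
const backEnd = {
  // Back end: stands in for a database or an external system.
  findGreeting(name) {
    return { text: "Hello, " + name };
  },
};

const businessLogic = {
  // Business logic layer: applies the rules and talks to the back end.
  greet(name) {
    const trimmed = name.trim();
    if (trimmed === "") {
      return { ok: false, message: "name is required" };
    }
    return { ok: true, message: backEnd.findGreeting(trimmed).text };
  },
};

// Distribution layer: the Web server does the preliminary processing,
// delegates to the business logic, and builds the response.
function handleRequest(url) {
  const query = url.split("?")[1] || "";
  const params = new URLSearchParams(query);
  const result = businessLogic.greet(params.get("name") || "");
  return { status: result.ok ? 200 : 400, body: result.message };
}

// Presentation layer: the client requests a URL and shows the response.
const response = handleRequest("/greet?name=Ada");
console.log(response.status, response.body);
```

Every request crosses each of these boundaries, which is why the later sections look at the cost of each hop.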

Minimal Code over the Wire
The goal of any Web application developer should be to send as little code as possible over the wire. How quickly a page loads in a browser depends on the speed of the client's network connection. Web pages normally range from 20K to 35K in size, and anything above 35K tends to degrade service. If the Web server is located on the other side of the globe, the time the data takes to travel depends on the distance and on the speed of the network. The one factor the developer controls is the size of the code traveling over the wire.
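
As a rough back-of-the-envelope model, load time can be approximated as one network round trip plus page size divided by bandwidth. The sketch below assumes a hypothetical 56 kbit/s link and a 0.2 second round trip; both numbers are illustrative, and real page loads involve more than one request.

```javascript
// Rough estimate: load time = round-trip latency + size / bandwidth.
function estimateLoadSeconds(pageBytes, bitsPerSecond, roundTripSeconds) {
  const transferSeconds = (pageBytes * 8) / bitsPerSecond;
  return roundTripSeconds + transferSeconds;
}

const modem = 56000; // hypothetical 56 kbit/s dial-up link
const smallPage = estimateLoadSeconds(20 * 1024, modem, 0.2);
const largePage = estimateLoadSeconds(70 * 1024, modem, 0.2);
console.log(smallPage.toFixed(1) + " s for 20K, " + largePage.toFixed(1) + " s for 70K");
```

In this model, latency and bandwidth are fixed by the client's connection, so page size is the only term the developer can reduce.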

Content Expiration and 304s
Content that is sent to the client can be given an expiration date. Web servers can be configured to make data expire after a day, a week, or a month. The right duration depends on how frequently the Web page is updated. If the page is updated every few hours, a long expiration may leave clients showing stale information. But if the data does not change often, this setting can be very handy.

One request and response can be composed of multiple data packets. If the client browser requests a GIF file that the server has recently delivered, the server sends a "304" (Not Modified) status back to the client, telling it that the copy it already has is still current. The client browser then retrieves the file from its local cache (the temporary Internet files folder). This is an excellent mechanism for saving network bandwidth, but the client still has to make a round trip before it learns that the requested content is already in its cache. Setting a content expiration on the Web server prevents this round trip: the client checks its cache first and asks the server only when the cached copy has expired.
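
The difference between an expired and an unexpired cache entry can be sketched as follows. This is a simplified model, not how any particular browser or server is implemented; the cache record shape (fetchedAt, maxAgeSeconds, lastModified) and both function names are assumptions made for illustration.

```javascript
// Sketch of the client-side decision described above.
// entry is a hypothetical cache record: { fetchedAt, maxAgeSeconds, lastModified }.
function planRequest(entry, nowSeconds) {
  if (!entry) {
    return { action: "full-request", headers: {} };
  }
  const ageSeconds = nowSeconds - entry.fetchedAt;
  if (entry.maxAgeSeconds !== undefined && ageSeconds < entry.maxAgeSeconds) {
    // Content has not expired: no round trip to the server at all.
    return { action: "use-cache", headers: {} };
  }
  // Expired or no expiration set: one round trip to revalidate.
  return {
    action: "conditional-request",
    headers: { "If-Modified-Since": entry.lastModified },
  };
}

// Server side of the conditional request: 304 if nothing changed, else 200.
function answerConditional(resourceLastModified, ifModifiedSince) {
  const unchanged = Date.parse(resourceLastModified) <= Date.parse(ifModifiedSince);
  return unchanged ? 304 : 200;
}

const logo = {
  fetchedAt: 1000,
  maxAgeSeconds: 86400, // expires after one day
  lastModified: "Mon, 06 May 2002 10:00:00 GMT",
};
console.log(planRequest(logo, 5000).action);
console.log(planRequest(logo, 1000 + 90000).action);
console.log(answerConditional("Mon, 06 May 2002 10:00:00 GMT", logo.lastModified));
```

Only the "use-cache" branch avoids the network entirely; the 304 path saves the payload but still costs a round trip.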

Include Files and Graphics Optimization
ASP/JSP pages and HTML pages can use include files (.js, .vbs, and so on) to organize the code base and deliver the intended functionality. All of these include files have to be delivered to the client before the client browser can process the page, and the browser has to request each file separately, one after another. Under normal circumstances, each page request has to bring all of the corresponding files down to the client, so one include file is preferable to ten. During development this is often impractical, because more than one developer may be working on the same Web page at the same time, and segregating logic into several modules lets them work simultaneously. Depending on the working relationship between developers and on team dynamics, a single include file can instead be generated at build time to keep the number of requests (and 304 responses) to a minimum.
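
One way to keep separate modules during development and still ship a single include file is to merge them as a build step. The sketch below works on in-memory strings to stay self-contained; the function name bundleIncludes and the file names are hypothetical, and a real build would read the files from disk.

```javascript
// Build-time sketch: merge several include files into one bundle.
// Each input is { name, source }; the names are used only as separators.
function bundleIncludes(files) {
  return (
    files
      .map((file) => "// ---- " + file.name + " ----\n" + file.source.trim())
      .join("\n\n") + "\n"
  );
}

const modules = [
  { name: "validation.js", source: "function isEmpty(s) { return s === ''; }\n" },
  { name: "menu.js", source: "function openMenu(id) { return 'open:' + id; }\n" },
];

const bundle = bundleIncludes(modules);
console.log(bundle);
console.log("requests before: " + modules.length + ", after: 1");
```

Developers keep editing validation.js and menu.js independently, while the page references only the generated bundle.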

Each image and cascading style sheet is likewise requested separately. One background image is preferable to four images joined to make up a single background, and one CSS file is preferable to three for the same page. GIF and JPG files should also be optimized with a proper graphics tool. Using 256 colors instead of 32 increases the file size. For example, saving the "Click Me" image as a GIF with 32 colors gives an image size of 898 bytes, saving the same image as a GIF with 256 colors gives 2.131K, and saving it as a JPG gives 2.83K. Clearly, if the image quality does not suffer, the preferred format for saving network bandwidth is GIF with 32 colors.
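
To put the "Click Me" figures in proportion, the relative savings can be worked out directly. The sketch assumes that 1K means 1,000 bytes, so 2.131K is taken as 2,131 bytes and 2.83K as 2,830 bytes; that reading is an assumption, not something the measurements state.

```javascript
// Image sizes from the example above, in bytes (assuming 1K = 1,000 bytes).
const sizes = { gif32: 898, gif256: 2131, jpg: 2830 };

// Percentage of bytes saved by sending the smaller file instead of the larger one.
function savingPercent(smallerBytes, largerBytes) {
  return Math.round((1 - smallerBytes / largerBytes) * 100);
}

console.log("vs 256-color GIF: " + savingPercent(sizes.gif32, sizes.gif256) + "% smaller");
console.log("vs JPG: " + savingPercent(sizes.gif32, sizes.jpg) + "% smaller");
```

Savings of this order on a single button add up quickly on a page that carries dozens of images.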

Cross Boundary COM/COM+ Calls
Frequently used object data should be saved within script variables. This will cut down on COM method calls, which are relatively expensive, compared to accessing script variables. COM calls should be minimized from ASP pages to achieve optimal performance. It is better to write a few lines of ASP code than to wrap it within a COM component.

In practice, accessing COM properties or methods can be deceptively expensive. Here is an example, showing some fairly common code (syntactically speaking):

Foo.bar.blah.baz = Foo.bar.blah.qaz(1)

If Foo.bar.blah.zaq = Foo.bar.blah.abc Then ' ...

When this code runs, here's what happens:

  1. The variable "Foo" is resolved as a global object.
  2. The variable "bar" is resolved as a member of "Foo". This turns out to be a COM method call.
  3. The variable "blah" is resolved as a member of "Foo.bar". This, too, turns out to be a COM method call.
  4. The variable "qaz" is resolved as a member of "Foo.bar.blah". Yes, this turns out to be a COM method call.
  5. Invoke "Foo.bar.blah.qaz(1)". One more COM method call. Get the picture?
  6. Do steps 1 through 3 again to resolve "baz". The system does not know if the call to "qaz" changed the object model, so steps 1 through 3 must be done again to resolve "baz".
  7. Resolve "baz" as a member of "Foo.bar.blah". Do the property put.
  8. Do steps 1 through 3 again and resolve "zaq".
  9. Do steps 1 through 3 yet another time and resolve "abc".

As you can see, this is terribly inefficient and slow. The fast way to write this code in VBScript is

Set myobj = Foo.bar.blah ' do the resolution of blah ONCE
myobj.baz = myobj.qaz(1)
If myobj.zaq = myobj.abc Then '...

If you're using VBScript 5.0 or later, you can write this using the With statement:

With Foo.bar.blah
    .baz = .qaz(1)
    If .zaq = .abc Then '...
    ...
End With

Note that this tip also works with VB programming.
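
The saving can be made visible with a small simulation. The sketch below is written in JavaScript rather than VBScript and uses a Proxy to count every property lookup on a stand-in object; real COM dispatch is not involved, the object is hypothetical, and only member lookups are counted (not the method invocation or the property put).

```javascript
let lookups = 0;

// Wrap an object so that every property read counts as one expensive lookup,
// standing in for a COM member resolution.
function counted(target) {
  return new Proxy(target, {
    get(obj, prop) {
      lookups += 1;
      const value = obj[prop];
      return value !== null && typeof value === "object" ? counted(value) : value;
    },
  });
}

const Foo = counted({
  bar: { blah: { baz: 0, zaq: 1, abc: 1, qaz: (n) => n + 1 } },
});

// Run a snippet and report how many lookups it triggered.
function countLookups(run) {
  lookups = 0;
  run();
  return lookups;
}

const chained = countLookups(() => {
  Foo.bar.blah.baz = Foo.bar.blah.qaz(1);
  if (Foo.bar.blah.zaq === Foo.bar.blah.abc) { /* ... */ }
});

const cached = countLookups(() => {
  const obj = Foo.bar.blah; // resolve blah once
  obj.baz = obj.qaz(1);
  if (obj.zaq === obj.abc) { /* ... */ }
});

console.log("chained: " + chained + " lookups, cached: " + cached + " lookups");
```

In this model the chained form performs 11 lookups and the cached form 5, and the gap widens with every additional statement that repeats the full chain.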

Optimized Logging
Server-side logging can be expensive. Logging can be done at a low, moderate, or high level of detail, depending on the architecture and on how much debugging support the product calls for. If every subroutine logs its entry and exit points, the resulting cross-boundary COM/COM+ calls become expensive and can undermine overall product performance. Minimal logging, on the other hand, can be useless from a developer's perspective. A balance has to be struck. That balance is a design decision, and the needs of the product should drive how logging is implemented.
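
One common way to strike that balance is to make the level of detail configurable, so that entry and exit logging can be switched on only when it is needed. The following is a minimal sketch under that assumption; the level names mirror the low, moderate, and high amounts mentioned above, and createLogger is a hypothetical helper, not part of any logging product.

```javascript
const LEVELS = { low: 1, moderate: 2, high: 3 };

// Returns a log function that keeps a message only when its level of detail
// is at or below the configured level. The sink is just an array here.
function createLogger(configuredLevel, sink) {
  const threshold = LEVELS[configuredLevel];
  return function log(detail, message) {
    if (LEVELS[detail] <= threshold) {
      sink.push("[" + detail + "] " + message);
    }
  };
}

const written = [];
const log = createLogger("moderate", written);

log("high", "enter ProcessOrder"); // entry/exit tracing: skipped at this setting
log("moderate", "order 42 validated");
log("low", "payment gateway unreachable");
log("high", "exit ProcessOrder");

console.log(written.join("\n"));
```

With the level set to "moderate", the two tracing messages never reach the sink, so the cost of writing them is paid only when someone deliberately raises the level.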

String Manipulation
... helps when the page is HTML based. However, with ASP and JSP pages, it becomes too expensive to send all that data over the wire each time a request is made. The data can be displayed dynamically instead. No doubt, it is confusing and tedious from a programming perspective to use "document.write" to render HTML, but it does save network bandwidth, because script files can be cached locally. There are many cascading style sheet editors on the market to facilitate the process of attaching cascading style sheets to HTML/ASP/JSP pages.

String manipulations can be very expensive from a performance perspective. If ASP/JSP code loops to build up a string to send back to the client browser, effort should be made to avoid doing so. For example, many people build a string in a loop like this:

s = "<table>" & vbCrLf & "<tr>"
For Each fld in rs.Fields
    s = s & "<th>" & fld.Name & "</th>"
Next
s = s & "</tr>" ' close the header row
While Not rs.EOF
    s = s & vbCrLf & "<tr>"
    For Each fld in rs.Fields
        s = s & "<td>" & fld.Value & "</td>"
    Next
    s = s & "</tr>"
    rs.MoveNext
Wend
s = s & vbCrLf & "</table>" & vbCrLf
Response.Write s

There are several problems with this approach. The first is that repeatedly concatenating a string takes quadratic time; less formally, the time that it takes to run this loop is proportional to the square of the number of records, times the number of fields. A simpler example should make this clearer:

s = ""
For i = Asc("A") to Asc("Z")
    s = s & Chr(i)
Next

On the first iteration, you get a one-character string, "A". On the second iteration, VBScript has to reallocate the string and copy two characters ("AB") into s. On the third iteration, it has to reallocate s again and copy three characters into it. On the Nth iteration (here the last one, with N = 26), it has to reallocate and copy N characters into s. That's a total of 1+2+3+...+N, which is N*(N+1)/2 character copies.

In the recordset example above, if there were 100 records and five fields, the inner loop would be executed 100*5 = 500 times, and the time taken to do all the copying and reallocation would be proportional to 500*500 = 250,000. That's a lot of copying for a modest-sized recordset.
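
The arithmetic is easy to check by counting the copies directly. The sketch below assumes, as the simplified A-to-Z example does, that every append adds one character and forces a full copy of the string built so far.

```javascript
// Count character copies when a string of n characters is built one
// character at a time and fully recopied on every append.
function copiesForConcatenation(n) {
  let total = 0;
  for (let length = 1; length <= n; length++) {
    total += length; // the whole string so far is copied again
  }
  return total;
}

// Closed form of the same sum: 1 + 2 + ... + n.
function closedForm(n) {
  return (n * (n + 1)) / 2;
}

console.log(copiesForConcatenation(26), closedForm(26));   // A to Z
console.log(copiesForConcatenation(500), closedForm(500)); // 100 records x 5 fields
```

For 26 appends this gives 351 copies, and for 500 appends it gives 125,250, which is about half of 500*500 and therefore grows in proportion to the square of the number of appends.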

In this example, the code could be improved by replacing the string concatenation with Response.Write() or inline script (<%= fld.Value %>). If response buffering is turned on (as it should be), this will be fast, because Response.Write just appends the data to the end of the response buffer. No reallocation is involved, and it's very efficient.
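
The same idea can be sketched outside ASP by appending each piece to a buffer and joining once at the end. This is an analogy rather than a model of how Response.Write is implemented; the write helper and the sample data are hypothetical, values are assumed to be HTML-safe already, and modern JavaScript engines optimize plain concatenation, so the sketch illustrates the pattern rather than a measured speedup.

```javascript
// Each call to write() appends to a buffer, much like Response.Write
// appending to the response buffer; the pieces are joined once at the end.
function renderTable(fieldNames, records) {
  const buffer = [];
  const write = (text) => buffer.push(text);

  write("<table>\n<tr>");
  for (const name of fieldNames) {
    write("<th>" + name + "</th>");
  }
  write("</tr>");
  for (const record of records) {
    write("\n<tr>");
    for (const name of fieldNames) {
      write("<td>" + record[name] + "</td>");
    }
    write("</tr>");
  }
  write("\n</table>\n");
  return buffer.join("");
}

const html = renderTable(["id", "name"], [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
]);
console.log(html);
```

No intermediate string is ever recopied in full, so the work grows with the size of the output rather than with its square.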

In the particular case of transforming an ADO recordset into an HTML table, consider using GetRows or GetString.
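
GetString lets the caller supply a column delimiter and a row delimiter, so the table markup itself can serve as the delimiters and no per-field loop is needed in script. The sketch below imitates that trick in JavaScript with a plain array of rows standing in for the recordset; tableFromRows is a hypothetical helper, and the values are assumed to be HTML-safe already.

```javascript
// rows[record][field]: a stand-in for data pulled out of a recordset.
// The cell and row separators play the role of GetString's column and
// row delimiters.
function tableFromRows(rows) {
  if (rows.length === 0) {
    return "<table></table>";
  }
  const body = rows
    .map((fields) => fields.join("</td><td>"))
    .join("</td></tr>\n<tr><td>");
  return "<table>\n<tr><td>" + body + "</td></tr>\n</table>";
}

console.log(tableFromRows([
  [1, "Ada"],
  [2, "Grace"],
]));
```

The joins happen inside the runtime in a single pass each, which is the same reason the ADO methods tend to outperform a script-level loop.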

If you concatenate strings in JScript, it is highly recommended that you use the += operator; that is, use

s += "some string", not s = s + "some string"

Miscellaneous
Here are a few other points to consider when tuning performance:

  • The different IIS application isolation levels should be tested in depth. There are three isolation levels: low, medium, and high.
  • COM components should be configured appropriately for high performance. There are three configuration options: unconfigured, configured as a library application, and configured as a server application.
  • Use Option Explicit at the top of your ASP pages. This forces developers to declare all of their variables, and declared variables are faster than undeclared ones.
  • Avoid redimensioning arrays.
  • Use trailing slashes in directory URLs. Without the trailing slash, it takes two requests to resolve the URL: the first reaches the server without the slash and is redirected, and the second carries the added slash.
  • Keep the server software up to date. Make sure that all of the components in use on the server are upgraded regularly and that service packs and patches are applied.
  • Turn on HTTP compression in IIS.
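
The trailing-slash advice can be applied when links are generated, so the extra redirect never happens. The helper below is a sketch under two assumptions that are not in the text: it receives a path without a query string, and it treats a last segment containing a dot as a file rather than a directory.

```javascript
// Append a trailing slash to directory-style paths so the server does not
// have to answer with a redirect first.
function withTrailingSlash(path) {
  const lastSegment = path.slice(path.lastIndexOf("/") + 1);
  const looksLikeFile = lastSegment.includes(".");
  if (path.endsWith("/") || looksLikeFile) {
    return path;
  }
  return path + "/";
}

console.log(withTrailingSlash("/products"));           // directory without slash
console.log(withTrailingSlash("/products/"));          // already correct
console.log(withTrailingSlash("/products/index.asp")); // file, left alone
```

Running links through a helper like this at page-generation time trades a one-time string check for a saved round trip on every click.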
