bembry.org

Site Function and Design

The following set of notes corresponds to the Site Functionality Design module of the Internet Design course, providing students an outline of the information they are expected to understand. The corresponding reading for this module is chapter nine of i-Net+ Study Guide by David Groth, et al., pages 392-442.

Server / Client Interactions

  • Studies indicate that a typical Internet user will become frustrated and consider leaving a site if it takes longer than 10 seconds to load.
  • When a user types in a URL, the following events occur before they see the web page:
    • URL looked up in DNS and connection created between client and server
    • Request waits in queue on server
    • Server responds to request
    • Data transmitted back to client
    • Client software decodes and then displays data
  • The time it takes for a user to download a web page may be estimated by dividing the size of the page in bits by the bandwidth in bits per second. This yields an ideal data transmission time; network lag, DNS lookup, and browser decoding time must also be factored in to give a better estimate of the time needed to download a page.
  • Possible reasons for a web site request failing include the following:
    • The URL is not valid
    • The client software is not configured correctly
    • The client's Internet connection is down
    • The server computer is down
    • The server's Internet connection is down
    • The server is currently too busy
    • The server's DNS files are corrupt
  • MIME types tell a web browser what kind of data a file contains, and therefore which application or plug-in should handle it. The browser keeps its own table of known MIME types. If a server sends a file whose MIME type the browser does not recognize, the browser will either prompt the user to download the proper plug-in software or display an error.
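The download-time estimate described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 70,000-byte page, the 56,000 bit/s link, and the 1.5 second overhead figure are assumptions chosen for the example, not values from the reading.

```python
def ideal_transfer_seconds(page_bytes: int, bandwidth_bps: float) -> float:
    """Ideal transmission time: page size in bits divided by bandwidth in bits/s."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return (page_bytes * 8) / bandwidth_bps


def estimated_load_seconds(page_bytes: int, bandwidth_bps: float,
                           overhead_seconds: float = 0.0) -> float:
    """Ideal time plus a lump-sum allowance for DNS lookup, network lag,
    and browser decoding (the allowance is an assumed, caller-supplied figure)."""
    return ideal_transfer_seconds(page_bytes, bandwidth_bps) + overhead_seconds


if __name__ == "__main__":
    # Hypothetical case: a 70,000-byte page over a 56,000 bit/s dial-up link.
    ideal = ideal_transfer_seconds(70_000, 56_000)
    total = estimated_load_seconds(70_000, 56_000, overhead_seconds=1.5)
    print(f"ideal: {ideal:.1f} s, with overhead: {total:.1f} s")
```

Under these assumed numbers the ideal time alone already reaches the 10 second frustration threshold mentioned at the top of this section, so any overhead pushes the page past it.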

Content Planning

  • A web site should be designed for its intended audience, keeping in mind their skill level, available technology, and interests.
  • Advanced features, such as animation or multimedia, should be used on a page only if they serve the page's purpose. Adding advanced features to a crummy page will only make it worse.
  • Avoid using plug-ins that are available for only a certain operating system.
  • For maximum compatibility, design the web page to fit within the 640x480 resolution used by a number of monitors (keeping the page at a maximum width of 640 pixels). Otherwise, if the page is designed to display larger than the screen resolution of the client monitor, the user will be forced to scroll in order to view the entire page.
  • Determine a policy to apply when designing your web site. Policies may include captive audience, which designs for a specific browser and plug-ins determined by the designers; lowest common denominator, which designs so that the page will work in any browser, including text-only browsers; 85% policy, where the site is designed so that most people can see it, but does not worry about reaching users without the technology for the site; or adaptive content, which designs the page so that it can both take advantage of advanced features on clients that support them and omit these features for clients unable to use them.
  • Two ways of making a web site suitable for both high and low end browsers are differential content and graceful degradation. Differential content involves creating two web sites, one with advanced functions and another with only basic functions, and diverting users to the site that suits their needs. Graceful degradation is designing a single site in such a way that advanced features will be implemented if the browser supports them, but will be ignored if the browser does not support them.
  • The policy should be written down and should give specific details on page sizes, browser support, multimedia, and any other basics of the page design.

Server Planning

  • A queue is a set of objects waiting in line. In relation to servers, a queue is the number of requests waiting to be serviced.
  • The utilization rate of a server is the rate of new requests divided by the maximum rate at which the requests can be fulfilled. A server that can handle a hundred requests per second and is receiving 50 requests per second has a utilization rate of 0.5.
  • If the utilization rate approaches or exceeds 1, the computer will fall behind, causing delays for clients trying to access the server. To keep the utilization rate safely below 1, you must reduce the number of requests coming in or increase the number of requests the server can handle.
  • The rate of fulfilled requests is equal to the number of active children divided by the time each request takes. Active children are the server processes (or threads) currently working on requests.
  • Network performance is affected by the network bandwidth, amount of RAM, CPU speed, disk input/output speed, and database connections. Any one of these areas may impede performance and produce a bottleneck. The most common bottleneck for a web server is the network bandwidth.
  • Redundancy is one of the best safeguards to have in case of an emergency. Having a second server, Internet line, power supply, or other component always available can mean the difference between having no down time and being down for days.
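The two server formulas above (utilization rate, and rate of fulfilled requests) can be worked through with a small Python sketch. The function names and the figure of 25 active children at 0.25 seconds per request are assumptions made for illustration; only the 50 and 100 requests per second come from the notes.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Rate of new requests divided by the maximum rate of fulfilled requests."""
    if service_rate <= 0:
        raise ValueError("service rate must be positive")
    return arrival_rate / service_rate


def max_fulfillment_rate(active_children: int, seconds_per_request: float) -> float:
    """Number of active children divided by the time each request takes."""
    if seconds_per_request <= 0:
        raise ValueError("request time must be positive")
    return active_children / seconds_per_request


if __name__ == "__main__":
    # Assumed configuration: 25 children, each taking 0.25 s per request.
    capacity = max_fulfillment_rate(25, 0.25)   # requests per second
    load = utilization(50, capacity)            # 50 requests/s arriving
    print(f"capacity: {capacity} req/s, utilization: {load}")
    if load >= 1:
        print("server will fall behind; reduce load or add capacity")
```

With these numbers the server can fulfil 100 requests per second, so 50 incoming requests per second gives the utilization rate of 0.5 used in the example above.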

Caching

  • Caching is storing information in a quick-access area (such as RAM or a local server) to make it easier to access in the near future.
  • In order to be useful, the files in a cache must be the most recent, up-to-date version. To keep information current, the cache may query the server holding the original document, checking whether the cached copy is still the same version as the most recent original. If the copy is outdated, the cache will download the newest version.
  • Most web browsers perform caching by saving all accessed web files on the hard drive. If the user returns to a cached web site, the browser will then determine whether to use the cached copy or reload the original web site. The rules a browser uses to determine when to use the cached file can be set within the browser configuration system.
  • A proxy server may provide caching for the network it serves, thus increasing speed for the users on the network. Proxy caching may be established on a LAN, on a dial-up ISP, or even among other proxy servers to create a regional proxy system.
  • Reverse proxy servers (or caching Web servers) sit between a web server and the rest of the network, providing cached copies of the files on the main web server. Using a reverse proxy helps to lower the demands placed on the main server and may speed system response time. However, reverse proxies are not very common.
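The freshness check described above (ask the original whether the cached copy is still current, and download again only if it is not) can be sketched as a toy in-memory model. The `Origin` and `RevalidatingCache` classes and the integer version counter are hypothetical stand-ins invented for this example; real browsers and proxies do this over HTTP rather than with Python objects.

```python
class Origin:
    """Hypothetical origin server: stores (version, body) for each URL."""

    def __init__(self):
        self._docs = {}
        self.downloads = 0  # counts full downloads, to show what the cache saves

    def publish(self, url, body):
        # Each publish bumps the document's version number.
        version = self._docs.get(url, (0, ""))[0] + 1
        self._docs[url] = (version, body)

    def current_version(self, url):
        # Cheap query: returns only the version, not the body.
        return self._docs[url][0]

    def download(self, url):
        self.downloads += 1
        return self._docs[url]


class RevalidatingCache:
    """Serves the cached copy only if the origin reports the same version."""

    def __init__(self, origin):
        self._origin = origin
        self._entries = {}  # url -> (version, body)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None or entry[0] != self._origin.current_version(url):
            # Missing or outdated: fetch the newest version and store it.
            entry = self._origin.download(url)
            self._entries[url] = entry
        return entry[1]


if __name__ == "__main__":
    origin = Origin()
    origin.publish("/index.html", "first draft")
    cache = RevalidatingCache(origin)
    cache.get("/index.html")      # miss: downloads the page
    cache.get("/index.html")      # hit: version unchanged, no download
    origin.publish("/index.html", "second draft")
    print(cache.get("/index.html"), origin.downloads)
```

The same pattern applies whether the cache lives in a browser, a proxy server, or a reverse proxy; only the location of the cache changes.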

Searching and Indexing

  • A site map, or site index, is like a table of contents for a web site. The site map lists the main sections of the site and the topics or pages available for each section.
  • A simple search searches a database of web sites for all occurrences of a specified word. A simple search can yield thousands of web sites, which is more than most users want.
  • Different search engines use different rules to develop more complex searches. Commonly, a + before a word indicates that it must exist in the results, a - indicates it must not exist in the results, and the word AND signifies that both the first and second word must appear in the results.
  • A reverse index (also called an inverted index) is a list of all the unique words that appear in a set of documents, with each word linked to the documents that contain it. Reverse indexes make it much easier to locate the documents that contain a certain set of words.
  • Searches may be refined and improved by focusing only on the information in META tags, finding sites that use the desired word most frequently, by delivering the most popular web sites at the top of the list, or by using various language concepts to better refine the search.
  • Language concepts used by search engines include omitting unnecessary words (a, an, the), stemming words (recognizing the various forms of a word as relating to the same concept, such as bake, baked, and baking), and classifying the meaning of a word based on its context (so that a search for "fly fishing" does not yield results about button fly jeans, planes that fly, and the common house fly).
  • Different search engines employ different techniques to yield their results. Some rely on various data searching techniques, while others incorporate human editors and some sell their search results to the highest bidder.
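The reverse index and the + / - search rules described above can be illustrated with a minimal Python sketch. The function names, the three sample documents, and the choice to treat every word without a leading - as required are assumptions made for this example; it splits on whitespace only and does no stemming or context analysis.

```python
from collections import defaultdict

# Unnecessary words omitted from the index, as listed in the notes.
STOP_WORDS = {"a", "an", "the"}


def build_index(documents):
    """Map each unique word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index[word].add(doc_id)
    return index


def search(index, query):
    """'+word' or plain 'word' must appear; '-word' must not appear."""
    required, excluded = [], []
    for term in query.lower().split():
        if term.startswith("-"):
            excluded.append(term[1:])
        else:
            required.append(term.lstrip("+"))
    if not required:
        return set()
    # Documents containing every required word.
    results = set.intersection(*(index.get(word, set()) for word in required))
    # Drop documents containing any excluded word.
    for word in excluded:
        results -= index.get(word, set())
    return results


if __name__ == "__main__":
    docs = {
        1: "fly fishing on the river",
        2: "button fly jeans",
        3: "the common house fly",
    }
    index = build_index(docs)
    print(search(index, "+fly +fishing"))
    print(search(index, "fly -jeans"))
```

Because the index is built once, each query only needs set operations on the word lists rather than a scan of every document.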