

The World Wide Web

What is the World Wide Web?

The World Wide Web, commonly called the Web or the WWW, is a tool you use on the Internet to access information stored on other systems throughout the world. Documents on the Web are linked to one another, and the HyperText Transfer Protocol (HTTP) carries them between systems. From your local host system, you activate a link to reach the remote host system that holds the information you want.

The Web has powerful linking abilities to other Internet services and resources worldwide. The Web consists of a body of information protocols, conventions, standards, and concepts. The best known of these concepts is HyperText, described later in this chapter. See also Some World Wide Web Protocols and Standards.

Information protocols can differ greatly from one another, but there are some similarities in how they operate.

For example, some Web protocols follow the client/server model (see Figure 2). Generally, a client connects to a server, sends an information pointer or a request, receives a response, and then closes the connection with the server. The response can be a file to display or a set of pointers to other servers.
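
The request cycle described above can be sketched in a few lines of Python. This is only an illustration, assuming a toy server on the local machine; the names PageHandler, fetch, and demo are hypothetical and do not come from any product described in this guide.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Toy server side (hypothetical): answers every GET with one small page."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(host, port, path):
    """Client side: connect, send a request, read the response, close."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)          # connects, then sends the request
        response = conn.getresponse()      # receives the response
        return response.status, response.read()
    finally:
        conn.close()                       # closes the connection


def demo():
    """Start the toy server on a free local port and fetch one page from it."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", server.server_port, "/index.html")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status code and the page body that the client received before it closed the connection.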

Figure 2 A Client/Server Model




Once information is available on the Web, it should be accessible from any type of computer in any country. You obtain the information using a program called a browser, which lets you view Web pages (documents prepared for use on the Web). You can follow links from page to page by pointing to a link with the mouse and clicking on it. The Web site that publishes a piece of information stores that information on its Web server.

For more information, see the chapter entitled Browsers and Servers.

How Did the World Wide Web Start?

Before the Web existed, information on the Internet was available in text form only and accessed by command line procedures. It became apparent that graphics, video, and sound capabilities were needed, plus an easier way to access information.

In 1990, the European Laboratory for Particle Physics (CERN) in Switzerland started the World Wide Web as a distributed hypermedia network service.

The initial development consisted of defining the HyperText Transfer Protocol (HTTP), the development of a sample server, and a programming library. HTTP is the client/server protocol still used for the operating rules and procedures for the Web.

In 1993, CERN placed the Web software in the public domain, making it freely available to the Internet community. Organizations and individuals around the world quickly started using the Web, developing browsers, adding new features, and developing support for additional platforms. Today Web servers and Web clients run on all major operating systems and computer architectures.

Who Owns the Web?

No single organization or person owns the Web. CERN is the original home for the World Wide Web Initiative, a cooperative organization that defines and supports the programming languages and protocols that make up the Web. Other organizations throughout the world also participate in developing components for the Web in cooperation with the World Wide Web Initiative.

The current home of the World Wide Web Initiative is the Massachusetts Institute of Technology (MIT) in the United States.

For more information about the Web Initiative and to view its home page, use your browser and go to: http://www.w3.org/hypertext/WWW/TheProject.html

What are the Web’s Capabilities?

As mentioned, before the World Wide Web existed, the Internet provided text-only material. With the Web, information can contain graphics, video, and sound with text and include links to other pages located anywhere in the world. The Web capabilities make it an extremely attractive medium for creating and accessing information.

What is HyperText?

HyperText is text with links that let users access another HyperText page by clicking on a highlighted word or phrase. Text with links is not a new idea. For example, book authors and publishers include references between the table of contents or index and the text. A book can also include references to other books and papers. With HyperText, the computer makes following such references, or links, as easy as turning the page.

Web users can also do a text search. A number of search engines are available on the Web, and they all work in much the same way: you type in some text and get back a HyperText answer that points you to the items found by the search. Figure 3 is an example of HyperText links between pages.

When reading a HyperText page, you do not have to follow the sequential organization of the pages. You can pursue a thread of your own. This makes HyperText an incredibly powerful learning tool. HyperText authors design their material to make it open to active exploration, with links that can lead from all or part of a page to all or part of another page.

Pages can contain text, graphics, movies, and sound. Because hyperlinks can connect different media, people use the terms HyperText and hypermedia interchangeably. Strictly, hypermedia means multimedia HyperText.

Figure 3 HyperText Links Between Pages



What Can the Web do for Users?

Using the Web, users can browse, navigate, and retrieve information by creating and interacting with HyperText pages.

The Web provides a way for users to link highlighted words and pictures within a page to other parts of the page, to other pages, or even to video sequences and sound files. System Administrators can store this information on entirely different systems as text, numerical data, images, sound, animation, or video.

The Web also lets users search Internet resources for information. For example, resources can be located on a local system or on a Web server anywhere in the world.

Using a browser with a graphical user interface (GUI), Web users can follow links by simply pointing to the link with the mouse and clicking. Clicking on HyperText links moves you to associated information.

Who Uses the Web?

All kinds of individuals, institutions, organizations, businesses, and governments use the World Wide Web for a wide range of purposes. Your imagination is your only limitation.

How do I Connect to the Internet and Web Resources?

You can get to the Web resources using a browser and a direct connection or an indirect connection. Both types of connections can use either a modem or network card to make the connection.

To take advantage of the full graphic capabilities of the Web you need to have a direct Internet Protocol (IP) connection.

Indirect Connections

When you have an indirect connection, your computer is one of many terminals connected to another computer. This computer, in turn, has an IP address and connects to the Internet. Your computer does not have its own IP address; it is not a node on the Internet.

If you have an indirect connection, you have disk space and access time on the computer with the IP address. This computer stores the files that you transfer from the Web and e-mail that you receive. You can, of course, transfer this information to your own system.

Direct Connections

When you have a direct connection, your computer has its own IP address and is an independent node on the network. With this type of connection, you can take full advantage of the multimedia capabilities of the Web.

Any files that you transfer and e-mail that you receive via the Web reside on your system. You can create a Web page, store it on your system, and use your browser to test your new page. You have full control over which programs you download, install and run.

Using a modem and a special network account, you can establish your system as a temporary node on the Internet. You get your own Internet address by using the Serial Line Internet Protocol (SLIP) or the Point-to-Point Protocol (PPP) to connect to a server that, in turn, connects to the Internet. This server assigns your local system an IP address, which your machine retains until you log off.

In most cases, the IP address assigned to your system is issued dynamically, meaning that your system receives a different address each time you log on. In some cases, however, this address is static, and your system is assigned the same IP address at each login session.

Getting More Information

You can get more information about connecting to the Internet by any of the following methods:

Using the Web

There are some requirements and characteristics that you should know about when using the Web. For some users these are an advantage; for others, a disadvantage. To determine what is best for you, you should know the following:

Web Processes and Software

There are certain processes and software you use to access the World Wide Web.

You need the Web browser software installed on your system. Your system is considered the client system when you access or retrieve pages using your Web browser. (Some people actually call a browser a client for this reason.) The system that has the page you want is called the server, because, in a sense, it serves your request.

You request the page by providing the browser with a Uniform Resource Locator (URL). A URL is essentially a character string that identifies the location of the information you want.

See the chapter entitled Uniform Resource Locators for more information.

Before the Web came into use there were, and still are, a variety of servers that distribute information on the Internet, such as Gopher, FTP, and Telnet servers. Each type of server uses its own protocol to service requests and transactions. These protocols cannot "talk" to each other without the help of an additional protocol.

The Web also uses its own protocol, called the HyperText Transfer Protocol (HTTP). Though this protocol is specific to Web traffic, many servers can act as a proxy for other protocols. This means that a request for a different protocol, such as FTP, can be made of a Web server through a browser. The server identifies the protocol requested and makes an appropriate call to another server. It then converts the data received into a Web-readable document and returns that page to the browser.

In addition, many browsers will also support different protocols, making it possible to perform calls directly to servers running different protocols. If the information you need is on a Gopher server, the browser acts like a Gopher client and uses the Gopher protocol. If the information is on an FTP server, the browser acts like an FTP client and uses the FTP protocol.
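
As a rough illustration of how a browser might decide which kind of client to act as, the following Python sketch picks a protocol and port from the URL. The function name choose_client and the table of supported schemes are hypothetical; the port numbers are the standard defaults for HTTP, FTP, and Gopher.

```python
from urllib.parse import urlsplit

# Standard default ports for the protocols named in the text above.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70}


def choose_client(url):
    """Return (protocol, host, port) for the client role the URL calls for."""
    parts = urlsplit(url)
    scheme = parts.scheme  # urlsplit already lowercases the scheme
    if scheme not in DEFAULT_PORTS:
        raise ValueError("unsupported protocol: {!r}".format(scheme))
    # Use the port written in the URL if there is one, else the default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return scheme, parts.hostname, port
```

A real browser would go on to open a connection and speak the chosen protocol; this sketch stops at the decision itself.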

A unit of information on the Web is known as a page, whether it fills one printed page or several. The standard tool for creating Web pages is the HyperText Markup Language (HTML). While Web browsers can understand many different formats, every Web browser understands HTML.

Client/Server Communication

Client/server communication is a two-way street. Your system is the client, and it accepts queries from your browser for information, sends them to the server, and then displays the results.

A server is the system that has the information you want. Client and server software can be on the same computer, but usually they are on different ones. The server performs tasks, processes requests, searches for information, or executes commands, all as directed by the client.

At any given time, a client can be a server, and a server can be a client. When your system has information someone else wants, their system is the client and your system becomes the server. This client/server relationship makes it possible for any connected computer to provide services to any other connected computer.

Some World Wide Web Protocols and Standards

The World Wide Web consists of a body of information protocols, standards, and conventions that govern its use. These information protocols can differ greatly from one another but they all allow all the clients and servers to communicate. The information in this section, organized by function, briefly describes the most common Web protocols and utilities.

Resource Addressing

Uniform Resource Locator (URL): A standard for identifying objects on the Internet accessible via the World Wide Web. Figure 4 shows four sample URLs.

Behind every link in a page is the network-wide address of the page to which the link refers. This network-wide address is referred to as a URL.

A URL is a string of characters that uniquely identifies an object on the network. You can think of a URL as sort of a catalog number for an Internet resource. A URL lets you specify the address for any object anywhere on the Internet, even though you access these objects using a variety of different protocols.

Figure 4 Sample URLs


http://www.w3.org/hypertext/WWW/TheProject.html

ftp://ftp.w3.org/pub

news:comp.infosystems.www

gopher://marvel.log.gov:70/
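
To see how a URL breaks down into its parts, the sketch below runs some of the Figure 4 samples through the Python standard library. The helper name describe is hypothetical, and the dictionary it returns is just one convenient way to show the pieces.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the parts a client needs to locate the object."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # which protocol to use
        "host": parts.hostname,   # which system holds the object
        "port": parts.port,       # None when the URL does not name a port
        "path": parts.path,       # where the object is on that system
    }


for sample in (
    "http://www.w3.org/hypertext/WWW/TheProject.html",
    "ftp://ftp.w3.org/pub",
    "gopher://marvel.log.gov:70/",
):
    print(describe(sample))
```

Each result names the protocol, the host, an optional port, and the path, which together identify one object on the network.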


For more information, see the chapter entitled Uniform Resource Locators.

File Transfer

HyperText Transfer Protocol (HTTP): A stateless search-and-retrieve protocol for World Wide Web operations.

The Web uses its own HyperText Transfer Protocol (HTTP) to do file transfers. This protocol is fast, stateless, and extensible. It helps solve the problems of different data types by using negotiation of the data representation.
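
Negotiation of the data representation can be pictured with a small Python sketch. It assumes the client sends an HTTP Accept header listing the media types it prefers, and the server picks the best match among the formats it has. The function names are hypothetical, and the matching is deliberately simplified: it orders by quality value only and ignores the finer precedence rules of the HTTP specification.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a type with no q parameter gets the default weight
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable weight as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep their order in the header.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def matches(pattern, media_type):
    """True if a pattern such as text/html, image/* or */* covers the type."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return media_type.startswith(pattern[:-1])
    return pattern == media_type


def negotiate(accept_header, available):
    """Pick the first available media type the client is willing to accept."""
    for pattern, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(pattern, media_type):
                return media_type
    return None
```

For example, negotiate("text/plain;q=0.5, text/html", ["text/plain", "text/html"]) selects text/html, because the client gave it the higher weight.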

Other protocols that Web clients can understand include FTP and Gopher and, depending on the client and server software, sometimes WAIS and NNTP.

For more information, see the chapter entitled HyperText Transfer Protocol.

Document Markup

HyperText Markup Language (HTML): The standard for writing and formatting HyperText pages on the World Wide Web.

Web browsers can understand many different formats, but there is one basic format every Web client understands: HTML. HTML is an application of the Standard Generalized Markup Language (SGML) that allows structured text with links, so an HTML document is valid SGML. HTML defines the logical structure of the document instead of its formatting. This allows browsers to display HTML pages on different platforms using different fonts and conventions.

HTML code is a set of tags for a particular SGML document type.
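
Because HTML tags describe structure rather than appearance, a program can read a page and pull out its headings and links without knowing how any browser would draw them. The Python sketch below illustrates this with the standard library parser; the class name StructureReader and the sample page are hypothetical.

```python
from html.parser import HTMLParser

# A hypothetical sample page: one heading, one paragraph, one link.
SAMPLE_PAGE = """
<html>
  <head><title>Sample</title></head>
  <body>
    <h1>Welcome</h1>
    <p>Read about <a href="http://www.w3.org/hypertext/WWW/TheProject.html">the project</a>.</p>
  </body>
</html>
"""


class StructureReader(HTMLParser):
    """Collect top-level headings and link targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_heading = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        # Only text that appears inside an <h1> element counts as a heading.
        if self._in_heading and data.strip():
            self.headings.append(data.strip())


reader = StructureReader()
reader.feed(SAMPLE_PAGE)
```

After feed() returns, reader.headings holds the heading text and reader.links holds the URL behind the link.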

For more information, see the chapter entitled HTML Page Design and Creation.

Program Execution

Common Gateway Interface (CGI): A standard mechanism for invoking programs from World Wide Web servers using GET and POST requests.

CGI is an interface that lets you run external programs from an HTTP server. A CGI program processes client requests and returns the appropriate information in a form usable to the client.

CGI programs can act on input from the user, for example to run simple searches or to process forms.
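
A minimal sketch of a CGI program in Python follows. It assumes the server passes the request to the program through the REQUEST_METHOD and QUERY_STRING environment variables and sends whatever the program writes to standard output back to the client. Only GET requests are handled here, and the form field name term is hypothetical.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_response(environ):
    """Build the CGI output (header, blank line, page) for one request."""
    if environ.get("REQUEST_METHOD", "GET") != "GET":
        return "Status: 405 Method Not Allowed\r\nContent-Type: text/plain\r\n\r\nGET only\n"
    query = parse_qs(environ.get("QUERY_STRING", ""))
    term = query.get("term", [""])[0]  # "term" is a hypothetical field name
    # Escape the user's input so it cannot inject markup into the page.
    body = "<html><body><p>You searched for: {}</p></body></html>\n".format(escape(term))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The server runs this program and relays its standard output.
    sys.stdout.write(build_response(os.environ))
```

A POST request would deliver its data on standard input instead of in QUERY_STRING; that case is left out to keep the sketch short.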

For more information, see the Purveyor Encrypt WebServer Programmer’s Guide and the default home page supplied with your software.

Clickable Images

This capability is a popular additional server feature that allows users to navigate the Web by clicking on predefined areas within a graphic image. When you click on a section or point of the image, the linked page or image appears.
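
The idea behind a clickable image can be sketched as a lookup from click coordinates to a link. The Python example below assumes rectangular areas only; the region table, the page names, and the function resolve_click are all hypothetical and do not describe the map format of any particular server.

```python
# Each entry: ((left, top, right, bottom) in pixels, page to show).
# These regions and page names are made up for illustration.
REGIONS = [
    ((0, 0, 99, 49), "/products.html"),
    ((100, 0, 199, 49), "/support.html"),
]
DEFAULT_PAGE = "/index.html"


def resolve_click(x, y, regions=REGIONS, default=DEFAULT_PAGE):
    """Return the page linked to the first region that contains (x, y)."""
    for (left, top, right, bottom), page in regions:
        if left <= x <= right and top <= y <= bottom:
            return page
    return default  # the click landed outside every predefined area
```

A click at (120, 10) falls in the second rectangle and leads to /support.html, while a click outside both rectangles falls back to the default page.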

For more information, see the chapter entitled Clickable Images.

Working with the Web

The World Wide Web provides people with a flexible communication and information retrieval system unlike any that has been used before. Knowing the basic information about the Web can give you ideas for how it can benefit you personally, your business, or your organization.

To put those ideas into action you need to know more about the inner workings of the Web, how to retrieve information, and how to create your own Web pages. The following chapters explain the Web and these subjects in more detail.

