December 01, 1998 12:00 AM

Index Web Content with Site Server 3.0

Windows IT Pro
InstantDoc ID #4528

The next step involves testing how well your index and searches work. I used the recently released Windows NT Magazine article index database, which contains all Windows NT Magazine articles and other Web content. Because the database is relational, Site Server can cross-reference other materials that relate to the information a user searches on. For Site Server to cross-reference these materials, I needed to index the database's articles, which are the information users request most often on the Windows NT Magazine Web site.

After I selected the Articles table, I had to decide which content column and primary key I wanted to use for reference. I selected the Abstract field as the column and the ArticleID as my primary key. Site Server provides a sophisticated default output, so you can also tell the software which column to use for the cross-reference hyperlink; I used the Title (i.e., article title) column.

After I configured these settings, the next screen let me determine which columns the search engine searches and retrieves information from. If you mark a column as searchable, users can search on that field after Site Server indexes the column. If you mark a field as retrievable, Site Server makes that field content available for display after a user performs a search. I marked the ArticleID and IssueID fields as retrievable. Finally, I clicked the Build the databases catalog now box. Screen 3 shows the search page I used to test my new database catalog, and Screen 4 shows the details page after I selected a result from the search page.
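The difference between searchable and retrievable columns can be sketched in a few lines of Python. This is only an illustration of the idea, not Site Server's data model: the `Catalog` class, the choice of Abstract and Title as the searchable columns, and the sample rows are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy stand-in for a database catalog; not Site Server's actual data model."""
    searchable: set[str]
    retrievable: set[str]
    rows: list[dict] = field(default_factory=list)

    def search(self, word: str) -> list[dict]:
        """Match `word` against searchable columns only; return retrievable columns only."""
        needle = word.lower()
        hits = []
        for row in self.rows:
            if any(needle in str(row.get(col, "")).lower() for col in self.searchable):
                hits.append({col: row[col] for col in sorted(self.retrievable) if col in row})
        return hits


# Hypothetical sample rows, invented purely for illustration.
catalog = Catalog(
    searchable={"Abstract", "Title"},
    retrievable={"ArticleID", "IssueID", "Title"},
    rows=[
        {"ArticleID": 1, "IssueID": 10, "Title": "Indexing basics",
         "Abstract": "How a crawl builds a catalog."},
        {"ArticleID": 2, "IssueID": 11, "Title": "Backup tips",
         "Abstract": "Scheduling tape rotation."},
    ],
)

results = catalog.search("catalog")
```

In this sketch a search for a word in the Abstract returns the ArticleID, IssueID, and Title of the matching row, but a search for an IssueID value finds nothing, because that column is retrievable without being searchable.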

On the Windows NT Magazine Web site, I used Allaire's Cold Fusion Application Server rather than Active Server Pages (ASP) to provide Web-to-database connectivity. If your enterprise is similar, you'll appreciate how easily you can migrate the Site Server Search summary ASP output to your Cold Fusion templates to process and manipulate the data further. Keep in mind when you create your catalog that you need to select the Site Server check box for your identity field (i.e., the field that makes each record unique) during the build process, and you'll have all that you need to finish the pages. Screen 5, page 186, shows the Site Server 3.0 results page after you build the catalog. I made a few formatting changes to make this page easier to read.

I wanted to link the results from Screen 5 to the Article Index pages. When I created the catalog, I marked the ArticleID and IssueID fields as retrievable. I needed the values in these fields to drive the queries on my article pages. So I opened the results.asp page in the \microsoft site server\siteserver\knowledge\search\database\search\articles directory and changed the value in the URL <% = RS("DocAddress") %>, which points to the view.asp page in the same directory with the ID value appended to it, to the location of the Cold Fusion page that I already use on the site. The new URL is http://servername/template.cfm?IssueID=<% = RS("IssueID") %>&ArticleID=<% = RS("ArticleID") %>.
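To make the shape of that link concrete, here is a small Python sketch that assembles the same URL from the two retrievable fields. It stands in for what the ASP expressions produce for each result row; the `article_url` function name and the sample ID values are hypothetical.

```python
from urllib.parse import urlencode


def article_url(server: str, issue_id: int, article_id: int) -> str:
    """Build the link target for one search hit from its two retrievable fields."""
    # urlencode keeps the dictionary's insertion order, so IssueID comes first.
    query = urlencode({"IssueID": issue_id, "ArticleID": article_id})
    return f"http://{server}/template.cfm?{query}"


# Hypothetical values standing in for RS("IssueID") and RS("ArticleID").
link = article_url("servername", 42, 1234)
```

Using `urlencode` instead of plain string concatenation also escapes any characters that aren't safe in a query string, which matters if an identity field is ever non-numeric.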

Indexing and Cataloging a Remote Web Site
After I set up my site to index static files and an ODBC database, I wanted to index a site on another machine and then propagate the catalog back to the indexed server. To test this process, I indexed the Windows NT Magazine Web site from my home office. I wanted to create and manage the process from my home server, but I didn't want my home server to perform the actual indexing. Two servers (LiveServer and SearchServer) in my work office run Site Server 3.0. My objective was to index LiveServer from SearchServer and make the index searchable from LiveServer.

To start, I ensured I had proper permissions set on SearchServer. I opened the MMC, double-clicked SearchServer, right-clicked Catalog Build Server, and selected the Accounts tab. I made sure that both the Administrative access account and the Default content access account had the username and password of a user with permissions on LiveServer. To make things simple, I used my user account. (If you need to change the account information under the Accounts tab, you will probably have to reboot the server for the changes to take effect.)

Next, I checked to see whether I had added the hosts I needed to the MMC view. To add these hosts, I right-clicked the Search folder in the MMC and selected Add Host. I added LiveServer and SearchServer. Then on SearchServer, I right-clicked Catalog Build Server and selected New Catalog with a Wizard. I entered ToLiveServer for the catalog name, and I chose to do a File crawl. For the Start address, I entered the path to the \\LiveServer\ProfCon share. I then selected the names of the servers where I wanted to propagate the completed catalogs. I selected LiveServer and deselected the default setting, which was the name of my work-at-home server that I was using to perform this procedure. On the final screen, I selected the Start build now check box and clicked Finish. That's all I had to do to configure SearchServer to index LiveServer, create a catalog, and propagate the catalog back to LiveServer so users can perform searches.

Although Windows NT's Performance Monitor isn't very scientific, I used it to watch the two servers while the remote indexing took place. I immediately saw the advantage of removing the indexing function from LiveServer. Whereas SearchServer's processor usage during the indexing function worked up to 100 percent until the index was complete, LiveServer's processor usage never varied more than 10 percent.

To test how well my remote indexing and cataloging worked, I went to http://LiveServer/siteserver/knowledge/search/default.htm and entered a search word. Sure enough, the server displayed results with hot links to the right cross-referenced files. However, I noticed that all the links were file://LiveServer/filename.html. You can't avoid this type of labeling while you create the index, but you can remedy the situation after you build the index. To display the proper link names, I found the catalog I just built on SearchServer, right-clicked and selected Properties, and selected the URLs tab. I clicked Add in the Mappings section and added

//LiveServer

in the Access location box and

http://LiveServer

in the Display location box. Then all I had to do was start a new build on the catalog. When the build completed, the search results displayed the mapped http:// URLs that the user sees. Figure 1 shows the process of gathering, indexing, and displaying the search results. This process lets SearchServer perform normal file crawls behind a firewall with whatever permissions the Index Server has, but still lets you display the files to the end user from LiveServer in proper Web format.
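The effect of the mapping can be pictured as a simple prefix rewrite. The Python sketch below shows that idea only; it assumes the access location is compared against the crawled address with its file: scheme removed, and it is not Site Server's actual implementation. The `display_url` function is a hypothetical name.

```python
def display_url(crawled: str, mappings: list[tuple[str, str]]) -> str:
    """Rewrite a crawled file: URL to its display form using (access, display) prefix pairs."""
    # Drop the file: scheme so the remainder can be compared with the access location.
    location = crawled[len("file:"):] if crawled.lower().startswith("file:") else crawled
    for access, display in mappings:
        # Server names are case-insensitive, so compare the prefix without regard to case.
        if location.lower().startswith(access.lower()):
            return display + location[len(access):]
    return crawled  # No mapping applies; leave the link unchanged.


# The access/display pair entered on the URLs tab.
mappings = [("//LiveServer", "http://LiveServer")]
shown = display_url("file://LiveServer/filename.html", mappings)
```

Addresses that don't begin with a mapped access location pass through unchanged, so a catalog that crawls several shares would need one mapping per share.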

By setting up SearchServer, I removed from LiveServer the strain that creating and maintaining a catalog puts on its resources. Also, I can now manage all my catalogs for numerous Web servers from one machine.

When you go to http://servername/siteserver/knowledge/search/default.htm, you can run a keyword search against all your catalogs at once or individually. For example, if you build a couple of catalogs from different sources, you can provide one interface to all of them, or you can separate each type of content into its own area.

Site Server's search capabilities have changed a lot since Site Server 2.0. Site Server Search is powerful, easy to implement, and a tremendous service to your users. Most important, Site Server Search provides enough functionality to justify buying Site Server 3.0.


