Disclaimer
This document and the software described in this document are furnished under and are subject to the terms of a license agreement or a non-disclosure agreement. Except as expressly set forth in such license agreement or non-disclosure agreement, NetIQ Corporation provides this document and the software described in this document as is without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of merchantability or fitness for a particular purpose. Some states do not allow disclaimers of express or implied warranties in certain transactions; therefore, this statement may not apply to you. This document and the software described in this document may not be lent, sold, or given away without the prior written permission of NetIQ Corporation, except as otherwise permitted by law. Except as expressly set forth in such license agreement or non-disclosure agreement, no part of this document or the software described in this document may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, or otherwise, without the prior written consent of NetIQ Corporation. Some companies, names, and data in this document are used for illustration purposes and may not represent real companies, individuals, or data. This document could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein. These changes may be incorporated in new editions of this document. NetIQ Corporation may make improvements in or changes to the software described in this document at any time. 1996-2004 NetIQ Corporation. All rights reserved. U.S. Government Restricted Rights: If the software and documentation are being acquired by or on behalf of the U.S. Government or by a U.S. Government prime contractor or subcontractor (at any tier), in accordance with 48 C.F.R. 227.7202-4 (for Department of Defense (DOD) acquisitions) and 48 C.F.R. 
2.101 and 12.212 (for non-DOD acquisitions), the government's rights in the software and documentation, including its rights to use, modify, reproduce, release, perform, display or disclose the software or documentation, will be subject in all respects to the commercial license rights and restrictions provided in the license agreement.
Trademarks
WebTrends is a registered trademark of NetIQ Corporation. Additional trademarks of NetIQ Corporation include: FastTrends, WebTrends SmartView, WebTrends Report Exporter, GeoTrends, WebTrends Express Viewer, WebTrends SmartSource Data Collector, WebTrends SmartReports, WebTrends On Demand, WebTrends Tech Tools, Log Analyzer, WebTrends Live, and WebTrends Reporting Center. Other brands and their products are trademarks or registered trademarks of their respective holders.
Support
Sales and General Contact Information
NetIQ Corporation
3553 N. First St.
San Jose, CA 95134
Phone: 1-408-856-3000
Fax: 1-408-273-0578
Sales: 1-888-323-6768
Email: info@netiq.com

Service and Support Online Resources
Direct Technical Support:
Americas: +1 503-223-3023
Asia Pacific, Australia, New Zealand: +1 503-223-3023
Europe, Middle East, Africa: +353 (0) 91-782 677
http://www.netiq.com/support

Customer Resource Center. A portal to resources that can help you make the most of your on-line web initiatives.
http://www.netiq.com/webtrends/resourcecenters.asp

WebTrends Portland, Oregon
851 SW 6th Ave. Suite 700
Portland OR 97204
Phone: 1-503-294-7025
Fax: 1-503-294-7130
US Toll Free: 1-888-932-8736
Email: info@webtrends.com
Web: http://www.webtrends.com
Table of Contents

Chapter 1

Chapter 2
    Commerce sites .... 33
    Lead-generation sites .... 34
    Self-service sites .... 36
    Intranet sites .... 37
    Branding sites .... 37
  Summary .... 38
  Objectives and Critical Metrics Worksheet .... 39

Chapter 3
    Using client-side tagging .... 49
    Combining web server logs and client-side tagging .... 52
  Hosted Versus Installed Software Solutions .... 52
  Choosing a Data Collection Method .... 53
  Data Collection Worksheet .... 54

Chapter 4  Visitor Identification .... 57
  Defining Web Activity .... 57
  Determining Unique Visitors .... 59
  Sessionizing Your Visits .... 59
  Visitor Identifiers .... 61
    Client IP address or domain name .... 62
    Combination of IP address and agent information .... 63
    Cookies .... 64
    Session IDs or IDs embedded in URLs .... 67
    Authenticated username .... 68
  Summary .... 70
  Finding the Features in WebTrends Products .... 71
  Visitor Identification Worksheet .... 72

Chapter 5
    Other site structure issues .... 87
  Summary .... 90
  Finding the Features in WebTrends Products .... 91
  Defining Behaviors Worksheet .... 92

Chapter 6
    Visits .... 95
    Hit filter criteria .... 96
    Visit filter criteria .... 104
  Handling Multiple Filters .... 108
    Data aggregation .... 109
    Table filtering .... 110
  Custom Reports .... 112
    Parent-child profiles: a structural alternative to custom reports and/or filters .... 115
  Summary .... 116
  Finding the Features in WebTrends Products .... 116
  Filtering Worksheet .... 117

Chapter 7
    Search engines .... 130
    Email marketing .... 134
  Summary .... 136
  Finding the Features in WebTrends Products .... 136
  Acquisition Metrics Worksheet .... 137

Chapter 8
    Scenario analysis .... 147
  Internal Search .... 152
  Exit Page and Exit Ratio Analysis .... 152
    Visit-to-exit ratio .... 153
  Dead-End Paths .... 154
  Gleaning Demographic Information Through Registration Forms .... 154
  Evaluating Visitor Behavior by Browsing Your Site .... 156
  Summary .... 157
  Finding the Features in WebTrends Products .... 157
  Conversion Worksheet .... 158

Chapter 9

Chapter 10
    Reporting from a web data warehouse .... 175
  Deeper Reporting and Exploration Using Excel .... 176
    Drill Down capability .... 177
    Working with dimensions and measures .... 179
    Overhead and monetary costs .... 183
    Using reports for continuous improvement .... 184
  Data Integration and Exploration Worksheet .... 185

Chapter 11
    Storage and performance issues .... 189
    Performance issues .... 196
  Finding the Features in WebTrends Products .... 198
  Optimizing Worksheet .... 199
Chapter 1
As an additional benefit, worksheets with pertinent questions are provided at the end of Chapters 2 through 11 to help you in your quest to find the right web analytic solution. Also, please consult the Glossary on page 201 for a brief explanation of many terms used in this book.
To a web content developer, web analysis is discovering traffic patterns that influence his or her design improvements. To a salesperson, web analysis is tracking which individual customers and prospects have been visiting the web site in order to narrow the sales approach for a given customer or prospect. Yet these perspectives are really applied definitions of web analysis. The mechanics of web analysis are a little different. From a mechanics perspective, web analysis is a three-step process in which you:

1. Collect web activity data.
2. Analyze the data that interests you.
3. Create meaningful reports on that data.

The catch is that you can accomplish these three steps in many different ways. In the end, though, each method arrives at a similar place: reports that help you determine whether your web site, or a part of it, is meeting its objective.

But why is web analysis so frequently misunderstood? According to a Forrester Research report, only 23 percent of companies use web analysis to improve their online operations. The most likely reason for this low adoption is that the basic concepts of web analysis and its implementation have never been fully discussed. Web analysis is often viewed as black magic that only a few gifted individuals know how to perform. In fact, many organizations have web analysis applications but experience so much frustration when using them that they abandon them altogether. Still other organizations find that the solutions they chose are either not comprehensive enough or too comprehensive for their needs.
Figure 1-2 shows an overview of how this guide relates to your overall web site strategy.
As part of your web site strategy, you need to identify the following:
- The primary goals of your organization
- The primary goals of your site
- Goals of individual sections of the site
- Successful visit profiles
- The drivers to successful visits
Figure 1-3. The Measurable Improvement Cycle

Applying this process to all web site decisions helps you focus your benchmarks and make critical adjustments to your web site, so that you improve each time you complete the cycle.
Stage 1: Report
Report on the key metrics for each of your site's objectives:
- Define the measurements you need.
- Configure your analysis solution and web site according to those measurements.
- Process and assemble the site's raw data into analysis reports.
Stage 2: Analyze
Use WebTrends to determine the performance of key metrics and site goals. Analysis in the form of reports allows you to:
- Set baseline performance.
- Evaluate the impact of site changes.
Stage 3: Decide
Determine what to do based on what the measurements tell you. Decisions might involve:
- Changing your web site.
- Altering marketing efforts.
- Revising content strategy.
- Updating your business model.
Stage 4: Act
Armed with the tables and graphs of your reports, you can optimize your site to improve the performance of key metrics:
- Change your web site's pages according to your data. For example, you might tweak the steps in the shopping cart scenario. Remember that small incremental improvements are the goal.
- Try A/B testing. On the web, this means sending 50% of your traffic to one page and 50% of your traffic to another page. However, A/B testing may reduce the action that you want from your visitors, such as registering or purchasing.
- As an alternative to A/B testing, filter x% of traffic to test against. Divert just a small percentage of visitors to the alternate web page that you want to test. This may allow you to gather more accurate testing results.
- Perform usability testing on the changes you made to your web site.
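The idea of diverting only a small percentage of visitors to an alternate page can be sketched in a few lines of Python. This is a minimal illustration, not a WebTrends feature: the function name `assign_variant`, the `test_share` parameter, and the use of a hashed visitor identifier are all assumptions made for the example. Hashing the identifier keeps each visitor on the same page across visits.

```python
import hashlib


def assign_variant(visitor_id: str, test_share: float = 0.10) -> str:
    """Return "test" for roughly `test_share` of visitors, otherwise "control".

    Hypothetical helper: the same visitor_id always gets the same answer,
    so a returning visitor keeps seeing the same version of the page.
    """
    if not 0.0 <= test_share <= 1.0:
        raise ValueError("test_share must be between 0 and 1")
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    # Map the hash onto one of 10,000 buckets (0..9999).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "test" if bucket < test_share * 10_000 else "control"


if __name__ == "__main__":
    for visitor in ("cookie-001", "cookie-002", "cookie-003"):
        print(visitor, assign_variant(visitor, test_share=0.10))
```

Setting `test_share=0.5` gives the classic 50/50 A/B split described above; a smaller value limits how many visitors are exposed to an untested page.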
Ongoing process
You will experience more success as you keep repeating the improvement cycle. Effective incremental change is a process rather than an end result. Sometimes you may need to change only one or two things before you run another analysis. Incrementally refining your changes might help you more than making wholesale alterations.
One typical problem that scenario analysis helps to identify arises when shipping information is available only within the checkout process. In such cases, you'll see a high number of abandonments on the page showing the shipping charge. These abandonments come from customers who are simply browsing and want to compare shipping charges with the competition. See "Scenario analysis" on page 147 for more information.

Filters: Filtering allows you to understand which segments of people are looking at your products and buying them. See Chapter 6, "Filtering and Analyzing Your Data," on page 93 for more information.
newsletter. See "Path analysis" on page 142 for more information.

Parameter analysis: If you allow visitors to sign up for a variety of new topics (like a graduated opt-in), you could use parameter analysis to report on the topics in which visitors are most interested. Additionally, you could correlate those topics of interest with other web site activity, such as content groups.

Scenario analysis: If you have a specific set of steps that you want your visitors to take, and one of those steps (such as in a checkout sequence) offers the visitor an opportunity to sign up for your newsletter, then you will want to use scenario analysis to determine whether the offer is placed in the correct step of the sequence. If people abandon the sequence at the point at which they should sign up for your newsletter, then perhaps the web page needs to be designed differently. See "Scenario analysis" on page 83 for more information.

Content groups: You might want to find out which product or set of products has been visited the most over the past few months and then make that product or product set a centerpiece of an upcoming newsletter. To find out how groups of products are faring, you'll use a concept called content groups. See "Content groups" on page 77 for more information.
See "Search engines" on page 130 for more information.

Ad campaigns: If you set up an ad campaign (tied to a specific search engine) as a referrer, landing page, or landing page parameter, you can examine how effective that campaign is. This could help you to determine which paid search engines are most effective. Which ad campaigns are the most successful and least successful? You might also want to evaluate the quality of the traffic that the ad campaign generated. Did various conversions occur? Did visitors spend a lot of time on the site? How many calls to action have been followed? See "Ad campaigns" on page 126 for more information.

Spider and robot report: You can determine how much of your raw traffic is attributed to spiders, which ones are indexing your site, and how deep into your site they are going. Spiders and robots are automated programs that crawl through the Internet to collect and index information, usually on behalf of a search engine or a monitoring company. You can use the report analysis to block spiders and robots from your web site.
Path analysis: Compare the navigation of visitors who purchase products with that of visitors who do not, and then fine-tune your web site according to what you've learned. See "Path analysis" on page 142 for more information.
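To make the spider and robot report described above more concrete, the following Python sketch estimates what share of raw hits comes from automated agents. It is only an illustration under stated assumptions: the token list `BOT_TOKENS` is a hypothetical, non-exhaustive set of substrings, and real reporting tools rely on maintained lists of known agents rather than simple substring matching.

```python
# Assumed, non-exhaustive substrings that often appear in crawler user agents.
BOT_TOKENS = ("bot", "crawler", "spider", "slurp")


def is_spider(user_agent: str) -> bool:
    """Guess whether a user-agent string belongs to a spider or robot."""
    agent = user_agent.lower()
    return any(token in agent for token in BOT_TOKENS)


def spider_share(user_agents) -> float:
    """Return the fraction of hits whose user agent looks automated."""
    agents = list(user_agents)
    if not agents:
        return 0.0
    return sum(is_spider(agent) for agent in agents) / len(agents)


if __name__ == "__main__":
    hits = [
        "Googlebot/2.1",
        "Mozilla/4.0 (compatible; MSIE 6.0)",
        "Yahoo! Slurp",
        "Mozilla/5.0",
    ]
    print(f"Spider share of raw hits: {spider_share(hits):.0%}")
```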
Chapter 2
- Increase visitor satisfaction by making the site more convenient and valuable to visitors
- Decrease acquisition costs
- Increase conversion rates
- Improve customer/visitor retention
- Increase your web ROI

However, since no two web sites are alike, each site can have individually tailored objectives. Table 2-1 identifies several types of web sites and some corresponding objectives.

Table 2-1. Site Objectives
Site Objective: Commerce
  Business Goal: Increase sales and generate revenue; Complement offline channels; Increase average order size
  Visitor Goal: Research products; Buy products
  Web Analysis Focus: Buying & research behavior; Obstacles to purchase; Visitor-to-buyer conversion; Abandonment analysis; Campaign effectiveness; Purchase drivers

Site Objective: Lead Generation
  Visitor Goal: Research products/services; Collect more information; Contact a representative
  Web Analysis Focus: Research behavior; Visitor-to-lead ratio; Lead quality & cost; Campaign effectiveness; Call to action optimization

Site Objective: Informational
  Visitor Goal: Find information; Conduct research
  Web Analysis Focus: Info-seeking behaviors; Ease of use and success; Electronic vs. traditional costs; Other supporting goals; Ad tracking, sponsorships, etc.

  Business Goal: Develop audience loyalty; Monetize through ads or commerce; Brand building
  Web Analysis Focus: Frequency, depth, and length of visits; Popular audience interests for targeting/segmenting; Ad tracking, sponsorships, etc.; Conversion from entertainment visits to other revenue or branding behavior visits

  Business Goal: Generate revenue through ads, referrals, paid search placements, visitor services; Build loyalty; Increase page views per visit; Increase visit frequency; Subscriptions to magazine, newspaper, and online publications
  Web Analysis Focus: Frequency and quality of visits (are they an ad clicker?); Advertising revenue generated; Visitor interest in content and preferences for segmentation; Audience growth, loyalty, engagement

Site Objective: Customer Self-Service
  Business Goal: Provide service online; Reduce service costs; Speed resolution rate; Offer problem resolution; Offer knowledge base information
  Web Analysis Focus: Visit frequency and duration; Issue resolution rate; Tracking of email inquiries after reviewing help pages; Most successful type of help content/pages

  Business Goal: Leverage Knowledge Base; Streamline operations; Provide access to critical applications
  Web Analysis Focus: Visit frequency and duration; Most popular content/pages; Completion of a series of steps (scenario); Visitor/department-level activity
Of course, most sites have multiple objectives and consequently fall into several of the above categories. Businesses generally focus on more than one task. For example, a company selling products will also be concerned about customer service and about lead generation for higher-end products. Also, large companies with multiple divisions may share portions of a web site and have numerous objectives. The message is clear: you must look at the chief characteristics of your web site. What does your web site do? Which handful of metrics will tell you that you are successful?
A customer self-service web site may be interested in the percentages of visitors that:
1. Log in to the members page.
2. Visit various pages with pertinent topics.
3. Print or download information.
4. Log out.

By measuring the visitors in each step of a scenario, you can determine where in the process you are losing the most people and then take action to improve the situation.

The following subsections discuss metrics for several general types of web sites. The vast majority of web sites represent a combination of the following five business models, as shown in Figure 2-1.
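Measuring visitors at each step of a scenario, as described above for the self-service example, amounts to a simple funnel calculation. The Python sketch below is an illustration only: the function names `funnel_report` and `biggest_drop` and the visitor counts are hypothetical, and an analysis product would derive the counts from collected web activity data.

```python
def funnel_report(step_counts):
    """Summarize a scenario given (step_name, visitors) pairs in step order.

    For each step, report the share of visitors retained from the previous
    step and from the first step (None when there is nothing to divide by).
    """
    rows = []
    first = step_counts[0][1] if step_counts else 0
    previous = None
    for name, visitors in step_counts:
        rows.append({
            "step": name,
            "visitors": visitors,
            "from_previous": visitors / previous if previous else None,
            "from_first": visitors / first if first else None,
        })
        previous = visitors
    return rows


def biggest_drop(step_counts):
    """Return the name of the step that loses the most visitors on entry."""
    worst_step, worst_loss = None, 0
    for (_, before), (name, after) in zip(step_counts, step_counts[1:]):
        loss = before - after
        if loss > worst_loss:
            worst_step, worst_loss = name, loss
    return worst_step


if __name__ == "__main__":
    # Hypothetical counts for the four self-service steps listed above.
    scenario = [
        ("Log in", 1000),
        ("Visit topic pages", 800),
        ("Print or download", 300),
        ("Log out", 250),
    ]
    for row in funnel_report(scenario):
        print(row)
    print("Largest loss entering:", biggest_drop(scenario))
```

With these sample numbers the largest loss occurs between visiting topic pages and printing or downloading, which is where you would look first for improvements.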
Content sites
Content sites refer to media sites and specialty portals that are supported by sponsors and ads, subscriptions, premium services, and other means. Examples are Yahoo, CNN.com, Salon.com, and Consumer Reports. Content sites are typically interested in the following metrics:
Average page views per visit: Content sites want an increasing number of page views per visit. By examining this metric in relation to content groups, you may gain more perspective on which areas are generating the most interest.

Average visits per visitor: How often are visitors returning each day, week, or month? This is an important metric that may indicate the success of a particular campaign.

Clickthroughs of onsite ads: Since many content sites are supported through advertising, monitoring the number of clickthroughs of these ads helps you gauge the value of the ad.

First-time versus returning visitors: Does the content engage visitors enough to make them return? By tracking the ratio between new and return visits over a period of time, you can determine whether your site is attracting enough returning visitors.

Average visit frequency and recency: You will want frequency to be high and recency to be low to retain and grow your audience.

Content group activity and history metrics: If a content group experiences fewer and fewer visits, then you can investigate and take action.

Number of search engine referrals: The number of visits referred by search engines is usually a critical metric for most content sites.

Specialized conversion rates: Conversion rates typically explore how many visitors move from one step to the next in a scenario that you are monitoring. Media sites may want visitors to register for topical newsletters to increase ad revenues and drive repeat traffic to the site.
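Several of the content-site metrics above are simple averages over visit records. The following Python sketch shows one way they could be computed; it is a minimal illustration, assuming each visit is recorded as a hypothetical `(visitor_id, page_views)` pair and treating a visitor with more than one visit in the period as "returning".

```python
from collections import Counter


def content_metrics(visits):
    """Compute basic content-site metrics from (visitor_id, page_views) pairs."""
    visits = list(visits)
    if not visits:
        return {
            "avg_page_views_per_visit": 0.0,
            "avg_visits_per_visitor": 0.0,
            "returning_visitor_share": 0.0,
        }
    visits_per_visitor = Counter(visitor for visitor, _ in visits)
    total_page_views = sum(page_views for _, page_views in visits)
    # Assumption: "returning" means more than one visit within this data set.
    returning = sum(1 for count in visits_per_visitor.values() if count > 1)
    return {
        "avg_page_views_per_visit": total_page_views / len(visits),
        "avg_visits_per_visitor": len(visits) / len(visits_per_visitor),
        "returning_visitor_share": returning / len(visits_per_visitor),
    }


if __name__ == "__main__":
    sample = [("a", 4), ("a", 6), ("b", 2), ("c", 8)]
    for name, value in content_metrics(sample).items():
        print(f"{name}: {value:.2f}")
```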
Commerce sites
Commerce sites are sites where companies sell their products and services. Examples are Amazon.com, WalMart, Converse, and Diamond.com. Commerce sites are typically interested in the following metrics:
Gross margin: Companies with high gross margins (gross revenue less cost of goods) have more money to spend on business operations such as research and development.

Gross margin return on investment (GMROI): GMROI is gross margin divided by the demand creation expense for that order. That is, gross margin dollars are divided by the cost of the demand creation activity that drove the sale. This comes from being able to track the most recent campaign.

Net profit: Represents the gross revenue minus taxes, interest, depreciation, cost of goods sold, and other expenses.

Total sales: Represents the total invoice value of sales, before deducting for customer discounts, allowances, or returns.

Average order size: Represents gross sales divided by the number of orders; this reveals the average amount spent on each order. The higher the average amount, the better you are at motivating buyers to purchase more.

Accessory attachment rate: This is the overall rate at which accessories are added to an order. It is the number of orders that have an accessory attached, divided by the total number of orders. This measurement helps determine how to grow the overall average order size, as well as the gross margin/profit of a single order. Accessories typically have the highest gross margin on a site and significantly increase the profitability of an order. For example, the cables on a DVD player order may bring in as many profit dollars as the player.

Sales conversion ratio: Represents the ratio of visitors to sales and visits to sales.

Customer retention rate: Represents the number of repeat customers divided by the number of total customers over a period of time. Commerce sites strive for repeat business.

Cost per sale: Represents marketing expenses divided by the number of sales during a period of time. Low cost per sale means efficient marketing and a higher net profit.

Customer acquisition cost: This is marketing expenses divided by the total number of orders from unique, first-time buyers over a period of time. If it costs a lot to acquire new customers, then you may have to retool your marketing effort.

Average lifetime value: What is the value of your customers over a period of time? Is it increasing?

Specialized conversion rates: Conversion rates typically explore how many visitors move from one step to the next in a scenario that you are monitoring. An example of a specialized conversion rate for a commerce site: your site invites visitors to register for a newsletter or sign up for a contest. Compare how many visitors see the offer with how many actually sign up.
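The ratio-based commerce metrics defined above can be written out as plain arithmetic. The Python sketch below is illustrative only: the function name, parameter names, and sample figures are hypothetical, gross revenue stands in for gross sales, the number of orders stands in for the number of sales, and total marketing expense stands in for the demand creation expense used by GMROI.

```python
def ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def commerce_metrics(gross_revenue, cost_of_goods, marketing_expense,
                     orders, orders_with_accessory, first_time_orders,
                     repeat_customers, total_customers):
    """Compute several commerce-site ratios for one reporting period."""
    gross_margin = gross_revenue - cost_of_goods
    return {
        "gross_margin": gross_margin,
        # Assumption: marketing_expense approximates demand creation expense.
        "gmroi": ratio(gross_margin, marketing_expense),
        "average_order_size": ratio(gross_revenue, orders),
        "accessory_attachment_rate": ratio(orders_with_accessory, orders),
        "customer_retention_rate": ratio(repeat_customers, total_customers),
        # Assumption: each order counts as one sale.
        "cost_per_sale": ratio(marketing_expense, orders),
        "customer_acquisition_cost": ratio(marketing_expense, first_time_orders),
    }


if __name__ == "__main__":
    # Hypothetical figures for one month.
    report = commerce_metrics(
        gross_revenue=50_000, cost_of_goods=30_000, marketing_expense=5_000,
        orders=400, orders_with_accessory=100, first_time_orders=250,
        repeat_customers=90, total_customers=300,
    )
    for name, value in report.items():
        print(f"{name}: {value}")
```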
Lead-generation sites
Lead-generation sites offer information for sales processes by actively capturing visitors as leads. This usually occurs after visitors register or contact a sales representative. Examples include Business-to-Consumer (B2C) web sites such as those for autos and homes, and Business-to-Business (B2B) web sites such as Siebel, PeopleSoft, and Boeing.
Visitor-to-lead conversion ratio: This represents the percent of visitors that register or otherwise become a lead over a period of time. If this metric dips or peaks, you should evaluate conversion rates by acquisition source (campaigns).

Total number of leads: If the number of leads does not grow, then a site may need to be re-evaluated. Consider examining the number of leads from search engines, campaigns, partners, or the number of leads for different products or from a geographic region.

Cost per lead: Represents marketing expenses divided by the number of leads generated during a period of time. This metric contributes to understanding the cost of marketing campaigns and collateral.

Lead close ratio: This is the percentage of collected leads that ended up closing as a sale. If leads are closed through channels other than your web site, you may have to track lead closure manually.

Average visits or page views per visitor: If your site is seen as a resource, it may attract more leads that value the content.

Marketing campaign conversion rate: This is the general effectiveness of campaigns at driving visitors to register as leads.

Specialized conversion rates: Conversion rates typically explore how many visitors move from one step to the next in a scenario that you are monitoring. An example of a specialized conversion rate for a lead-generation site: your site wants to evaluate which methods (such as a newsletter or a webcast) lead to the highest closure rates.
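Because the text above recommends evaluating lead metrics by acquisition source, the following Python sketch computes the visitor-to-lead conversion ratio, cost per lead, and lead close ratio per source. It is a hedged illustration: the function name, the dictionary layout, and the sample sources and numbers are all hypothetical.

```python
def lead_metrics_by_source(sources):
    """Compute lead-generation ratios for each acquisition source.

    `sources` maps a source name to a dict with the hypothetical keys
    "visitors", "leads", "closed" (leads that became sales), and "spend".
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    report = {}
    for name, data in sources.items():
        report[name] = {
            "visitor_to_lead": ratio(data["leads"], data["visitors"]),
            "cost_per_lead": ratio(data["spend"], data["leads"]),
            "lead_close_ratio": ratio(data["closed"], data["leads"]),
        }
    return report


if __name__ == "__main__":
    sample = {
        "search": {"visitors": 2000, "leads": 100, "closed": 10, "spend": 1500.0},
        "newsletter": {"visitors": 500, "leads": 50, "closed": 10, "spend": 250.0},
    }
    for source, metrics in lead_metrics_by_source(sample).items():
        print(source, metrics)
```

In this made-up sample the newsletter converts and closes at a higher rate and at a lower cost per lead than search, which is the kind of comparison the specialized conversion rate example describes.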
Self-service sites
Self-service sites focus on helping customers resolve issues and/or learn about uses of the product or service without the aid of human interaction. Self-service sites are often a component of another model but can stand alone. Examples are support/knowledge base sites of most manufacturers and software developers, and online banking. Self-service sites are typically interested in the following metrics:
Average visits per visitor An increase or decrease of average visits per visit may be seen as positive or negative, depending on the sites objectives. On the one hand, an increase is good for a governmental web site or an intranet maintained for employees, because it shows that visitors are performing many tasks, such as scheduling vacations, reading corporate policies, or checking on 401K plans. On the other hand, a software manufacturer may want the visits per visitor to decrease, indicating that people are finding what they need quickly. Average page views per visit The same considerations apply here as with visits per visitor. Compare average page views per visit with content groups to know whether a decrease or increase in activity is good or bad. Knowledgebase searches per visit How easy is it for visitors to find the information they want? If some knowledgebase articles are searched quite often, you may have to put better explanations into your product. Number of zero result queries This represents how often a visitor searches on a term and receives zero search results. You need to add new content if visitors received zero results after querying the same or similar keywords. Online resolution rate This rate is the percentage of site visits that resolve issues online versus those that need additional help over the phone or email. Percentage of total support requests handled online This information helps to identify which support options visitors are using and to what degree. If a certain option gets more attention than others, then you might consider upgrading the corresponding part of your product.
Specialized conversion rates: Conversion rates typically explore how many visitors move from one step to the next in a scenario that is being monitored. An example of a specialized conversion rate for a self-service site: a cellular company might want to allow its customers to edit their general account information, modify their calling plans, or download new ring tones.
Intranet sites
Intranet sites are primarily company or organization sites that provide services for employees. Employees typically use intranet sites to schedule vacation, to download and print medical forms, to check company policies, and to perform a variety of other tasks. Intranet sites have many of the same issues as self-service sites, except that you know your total number of visitors (the employees). Therefore, the resulting reports will accurately reflect usage in relation to a known number of visitors. Intranet sites use the same metrics as self-service sites. For example, by using scenario analysis you could look at the steps in a process such as filling out a vacation request form. Perhaps you would find that some employees abandon the process at a certain step because they are still unsure about their vacation plans. This would be similar to the steps explored in the specialized conversion rates mentioned in the metrics for self-service sites.
Branding sites
Branding sites are those that seek to promote interaction with visitors and engage them with a brand. Sponsored by companies, initiatives, and/or events, branding sites intend to generate buzz, interest in a product/company, or stimulate sales. Note that these sites do not justify their existence on sales/leads generated or ad revenue. Examples of branding sites are absolut.com, movie sites, and Coca-Cola. Branding sites are typically interested in the following metrics:
Unique visitors: Monitoring unique visitors by day, week, month, quarter, and year helps to evaluate the effectiveness of your online branding.
Depth of exploration: This includes measures such as average page views per visit, length of time, and content group exposure. When tied to a campaign, you can find out to what depth that campaign affected visitors.
Repeat/returning visitors: Successful branding sites attract multiple, continuous interactions with visitors.
Average visit frequency, recency, and latency by content area visited: These measurements continue the concept of sustained interaction with visitors. Loyal visitors, for example, are the ones that typically purchase more products.
Specialized conversion rates: The rate at which visitors play games, download coupons or screen savers, enter contests, and so on, and then register with your site is very important.
Summary
After your company has firmly determined the objectives for its web site and determined which specific metrics to track, you can use WebTrends to get the reports that you need. These reports will influence the way you change your web site. You might, for example, improve the content in a sequence of steps that leads to the purchase of an item. In most cases, it is best to make small, incremental changes to your web site. You can then direct WebTrends to measure your visitors and get a new set of results to study. Of course, after you've made your changes, you may need to re-examine your site's goals and objectives, and then add a new set of measurements. This is part of the continuous Measurable Improvement Cycle that was discussed in Chapter 1 on page 18. To help you think through the objectives and critical metrics of your web site, you can refer to the Objectives and Critical Metrics Worksheet on page 39. To begin understanding how to collect the data that you will explore with web analytics, continue to the next chapter, Chapter 3, "Collecting Your Web Activity Data," on page 41.
Consideration (record your answers under Comments)
What are the high-level goals of your web site?
What would a successful visit to your web site be?
What business model is your site? (Commerce, Content, Self-Service, Lead Generation, or Branding/Campaign)
How would you improve your web site?
What are more specific objectives for your web site? (Business goals, Visitor goals)
What do you need to measure to improve your site?
Chapter 3
them to obtain IT-based metrics such as spiders, downloads, bandwidth, and errors. Client-side tagging can help them to get business metrics such as screen resolution and Java-enabled browsers. There are many other differences in the data collected by these two methods that may or may not be relevant to your analytics needs, and these differences are discussed in their respective subsections.
- Any query parameters, if additional information is needed. Not required but strongly recommended. Used for analyzing dynamic content.
- The return code (successful or failed delivery of the request). Not required. Used for reporting on user and system errors.
- The number of bytes sent by the web server to the client. Not required. Used for reporting on bandwidth usage.
- The number of bytes sent by the client to the web server. Not required. Used to report on the amount of data sent from visitors to the web site.
- The amount of time (in milliseconds) to fulfill the request. Not required, but if present, this is used for reports involving server response time.
- The port on the client machine used to send requests and receive the requested data. Not required. Not generally used.
- The client machine's browser type and version number (also known as the agent). Not required. This is used for determining which browsers are in use, and for recognizing various types of spiders and search engine robots.
- Cookie information, if the client machine has a cookie for your site. Not required, though very useful for tracking unique visitors. Also, cookies can contain other, site-specific information, which can be analyzed and reported on.
- Referrer information, if the visitor was sent to your site from an external site. Not required. Used for recognizing how visitors arrived at your site, especially via search engines.
Note: This logged information and the order in which it appears are specified by the software in the web server that keeps the log files. For Microsoft systems, the software is called Internet Information Services (IIS). You can configure the software to reorder or drop pieces of information that you might find unnecessary, but it is best to do this only after you've gained some expertise with web analytics.
Each log entry appears as information on one very long line in the file. The following is a sample log entry:
2002-09-16 00:01:58 65.70.31.3 W3SVC82 HERC 209.224.1.170 GET /products/thingamajigger.html 200 4199 363 266 80 HTTP/1.0 Mozilla/4.72+[en]C-SBI-NC472++(Windows+NT+5.0;+U) WEBTRENDS_ID=192.168.32.180-3425858080.29527895 http://www.awebsite.com/thingamajiggerad.html
Figure 3-1 explains this log entry by relating each bulleted item above to the corresponding information in the sample log entry.
Your log file can vary from this example, because you can configure your server to include the information you want. Also, the information available may vary according to the brand of server software (for example, IIS, iPlanet, or Apache). Please refer to the server software's documentation for directions on how to activate logging. Note that enabling logging for Process Accounting in IIS can cause unnecessary complications.
Note: For a more complete sample of a log file according to the format provided by Microsoft IIS versions 4 and 5, see NetIQ's Knowledge Base article NETIQKB2382 (www.netiq.com/kb/esupport/consumer/esupport.asp?id=NETIQKB2382).
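As a rough illustration of how the sample log entry above breaks down into fields, the following Python sketch splits the entry on whitespace and labels each value. The field names are descriptive labels chosen for this example (they are assumptions, not official IIS field names), and the mapping of the numeric values follows the order of the bulleted items above. A log with a different field configuration would need a different list.

```python
# Illustrative labels only; the order is assumed from the sample entry above.
FIELDS = [
    "date", "time", "client_ip", "site_name", "server_name", "server_ip",
    "method", "uri_stem", "status", "bytes_sent", "bytes_received",
    "time_taken_ms", "port", "protocol", "user_agent", "cookie", "referrer",
]

SAMPLE = (
    "2002-09-16 00:01:58 65.70.31.3 W3SVC82 HERC 209.224.1.170 GET "
    "/products/thingamajigger.html 200 4199 363 266 80 HTTP/1.0 "
    "Mozilla/4.72+[en]C-SBI-NC472++(Windows+NT+5.0;+U) "
    "WEBTRENDS_ID=192.168.32.180-3425858080.29527895 "
    "http://www.awebsite.com/thingamajiggerad.html"
)


def parse_entry(line, fields=FIELDS):
    """Split one space-delimited log entry and label each value."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields, found {len(values)}"
        )
    return dict(zip(fields, values))


entry = parse_entry(SAMPLE)
print(entry["uri_stem"], entry["status"], entry["referrer"])
```

Because spaces inside values are logged as plus signs in this sample, a plain whitespace split is enough here; other log formats may quote or escape values differently.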
In cases in which odd URLs have been produced by some content management systems, you may need programmers who can write scripts (that is, special code in a language such as Perl) to preprocess log files before giving them to WebTrends software. Note: WebTrends offers a built-in method, called conduit scripting, which can be used to massage log files from content management systems such as Vignette, BroadVision, and Macromedia Spectra.
How frequently you import the log files depends on how much activity your site experiences. As a general rule, most sites bring over their log files once a day. However, if your site has high levels of activity and generates extremely large log files, you may need to transfer files more frequently. This reduces the data volume that must be handled at any given time. WebTrends is designed to recognize which files have already been imported, and only brings in files that contain new data. In comparison, accessing your log files from a network drive is a more familiar way of obtaining your log file data because WebTrends treats it as though the log files were stored locally. Don't be fooled though, because in reality the data still needs to come across the network from the mapped drive. This data transfer greatly slows the entire analysis process.
Note: One week's worth of log file data will give you a snapshot of the volumes of activity on a site, but you will probably need three months' worth of data to get real insight into the trends. Once you understand the trends, spikes and anomalies become evident, and usually their cause can be traced and evaluated.
Corrupt log files: If the log file is there, but WebTrends cannot read it, then the log file might be corrupt.
Missing log files: Are you sure that they are not written elsewhere on the system?
Log file hell: If the web site is hosted on geographically dispersed servers, WebTrends has to collect all the log files in one place and have a means of ordering the records from all the log files. It must then determine which hits are part of the same visit. If time stamps on the various web server logs are not in sync, results can be inaccurate. You must also have a way to handle server disruption, or the results can be inaccurate.
- Log files can't record repeat requests when a page is accessed from a caching server.
- Information can be inaccurate because of proxy servers and content delivery networks, such as AOL, AT&T, and Earthlink. (See Proxy server buffers on page 63 for more information.)
- Depending on the level of sophistication, the software installation and configuration may take time. The learning curve for this software is steep.
- You must maintain the equipment and software yourself, unless an ISP does this for you.
- You must write scripts (or purchase software containing ready-made scripts) to handle odd URLs that may need more processing to understand correctly.
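To illustrate what ordering the records from several servers involves, here is a minimal Python sketch that merges two logs into one time-ordered stream. It is not how WebTrends does this internally. It assumes that each log is already in time order, that every record starts with a `YYYY-MM-DD HH:MM:SS` time stamp, and that all servers log in the same time zone with synchronized clocks. The sample records are made up for the example.

```python
import heapq


def merge_logs(*logs):
    """Merge logs that are each already time-ordered into one ordered list.

    The first 19 characters of a record are assumed to be a
    'YYYY-MM-DD HH:MM:SS' time stamp, which sorts correctly as text.
    """
    return list(heapq.merge(*logs, key=lambda record: record[:19]))


# Hypothetical records from two web servers.
server_a = [
    "2002-01-01 00:40:46 24.166.12.188 GET /a.html",
    "2002-01-01 00:44:17 24.166.12.188 GET /c.html",
]
server_b = [
    "2002-01-01 00:41:22 217.194.141.67 GET /b.html",
]

merged = merge_logs(server_a, server_b)
for record in merged:
    print(record)
```

If the server clocks drift apart, the merged order no longer reflects the real order of the hits, which is why unsynchronized time stamps lead to inaccurate results.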
The key to data collection is the HTTP request for a transparent 1-pixel-by-1-pixel image. In reality, the image request is just a transport vehicle for the variable, which contains the visit information. The information in the variable gets transported to the data collection server in the request. At the data collection server, the information in the variable is used to add a new record to a web activity file that you can use for web site analysis. Figure 3-4 shows a typical client-side tagging process.
Here are the basic steps of the tagging process:
1. A visitor wants to view a page on your site. This initiates a page request to your web server.
2. Your server sends the page to the visitor, and this page contains a JavaScript tag.
3. The tag triggers a request for a GIF with parameters attached.
4. The GIF file is sent to the visitor.
5. The request with the parameters is analyzed.
The tagging method can be hosted externally, or you can host it onsite. Typically, if you want deeper analysis capabilities, you would handle the data collection internally to keep the data on hand. Most external hosting companies do not hold your data for an extended period; they simply offer you standard reports on summary web activity data. The tags put information into a dedicated data file for analysis. A typical data-file record might look like this:
2001-03-04 00:08:18 proxy7.hotmail.com W3SVC3 web1 192.168.1.1 GET /ads/default.asp redir=products&ad=http%3A//www.boatdealer.com&WT.mc_n=Boat%20Dealer%20Campaign&WT.mc_t=Banner&WT.mc_s=3/3/2001&WT.mc_c=60&WT.ad=P-32,%20P-58,%20P-72%20Options%20Offer&WT.sv=Web%20Server%201&WT.ti=Advertising%20Redirect&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768&WT.jo=Yes&WT.js=Yes&WT.co=Yes 200 0 1 75 1 80 HTTP/1.1 Microsoft+Internet+Explorer/4.40.305beta+(Windows+95) WEBTRENDS_ID=192.168.16.148-1615253808.29527727 http://www.boatdealer.com/dealers/pacific/dealerlist.htm
The italicized text contains client-side tagging parameters, which were used to fetch the data from a database that populated the web page template, default.asp. Note that the increasing amount of information gathered for each record may quickly fill your SDC server. Therefore, this server must be monitored closely. You may need to transfer the data files to another server, as discussed in Log file rotation/rollover on page 45. Note: For its tags, WebTrends has developed special parameters called WebTrends SmartSource Parameters. In the above example, all WebTrends SmartSource Parameters begin with WT.
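As a sketch of how the tagging parameters can be pulled out of such a record, the following Python example decodes a query string and keeps only the parameters that begin with WT. The query string is a shortened copy of the one in the sample record, and the function name is made up for this example.

```python
from urllib.parse import parse_qsl

# Shortened copy of the query string from the sample record above.
QUERY = (
    "redir=products&WT.mc_n=Boat%20Dealer%20Campaign&WT.mc_t=Banner"
    "&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768"
    "&WT.jo=Yes&WT.js=Yes&WT.co=Yes"
)


def smartsource_params(query):
    """Return the decoded parameters whose names start with 'WT.'."""
    return {
        name: value
        for name, value in parse_qsl(query)
        if name.startswith("WT.")
    }


params = smartsource_params(QUERY)
for name, value in params.items():
    print(name, "=", value)
```

Note that `parse_qsl` decodes percent-encoded characters, so `Boat%20Dealer%20Campaign` is returned as `Boat Dealer Campaign`.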
Using a hosting service is an attractive option for several reasons. The foremost reason is that you don't have to maintain the web analysis software or hardware, and you can write off the service as an operating expense. Also, a hosting service arrangement doesn't require the additional setup time that complex software solutions require. If you don't like the service, you can easily cancel or finish out the contract and disable the data collection.
In contrast, installed software (non-hosted) solutions provide greater flexibility regarding the data you can analyze and in the way you can present that data. With data collected from web server data files (the most common kind of non-hosted solution), you can store web activity data indefinitely in raw log file format or processed in a web data warehouse. This means that at any time you can re-analyze the data, combine it with external data sources, or run deeper analyses using third-party software. Another key advantage of installed software is privacy, because you control the data, which is never stored on a third-party server. Privacy is especially important for financial industries, such as banking and insurance. The main drawback of installed software is that you must maintain the software and hardware associated with your analysis solution. For this reason, the expenses are viewed by accounting as company assets, which are only depreciable and not deductible.
Traditionally, the client-side tagging model has been primarily used as a hosted solution with products such as WebTrends On Demand, and web server data file analysis has most often been used with software (non-hosted) solutions. However, with the advent of data collection servers, organizations can now use client-side tagging to collect activity data themselves (as a non-hosted solution) and either report on that data directly, or store the data in a web data warehouse.
Some organizations choose to combine the web server log and client-side tagging methods. They generate standard reports using client-side tags or the .asp model, but collect and store web server log data to allow flexibility later on. In the future, many organizations will probably find that using a non-hosted client-side tagging solution with a data collection server is more attractive than using web server logs. They will be able to collect and store the same information that web server logs can, allowing more in-depth and flexible analysis and reporting, yet also offering immediate report generation on standard data.
Consideration (answer Yes or No, and add Comments)
Need access to log files? (Note: Hosted services don't allow access to log files.)
Need to keep data for an extended period of time to do comparisons?
Capture information on all downloads (HTML and non-HTML files)?
Use multiple or co-located servers?
All servers are available at all times?
Can afford up-front investment in terms of capital and training time?
Can maintain additional hardware equipment and software?
Need to write off costs as an operating expense?
Capture data of only specific web pages?
Quick install/uninstall?
Can afford extra hardware?
Can embed code in each page to be tracked (also redirect pages)?
Only care about HTML pages and business metrics (don't care about IT-based metrics)?
Prepared for software costs related to licensing?
Have the people (IT) resources?
Know what kinds of information needed (business and/or IT)?
Have the storage retention/space (time/how long)?
Chapter 4
Visitor Identification
The main objective of web analysis is to understand how web visitors are using your site (what pages are visited and what actions are taken) so that you can determine if they are doing what you want them to do. Are visitors responding to ads? Are visitors making purchases? Are visitors reviewing your technical support materials rather than calling your technical support personnel? These are questions that you can answer by using WebTrends. Your web activity data file, whether generated by the web server itself or collected and created by a data collection server, can tell you more about the activity on your site. But how can you tie activity to individual visitors? How can you tell whether a hit to a product information page and a hit to the pages of a shopping cart were made by the same visitor? If you knew that, you could say that a particular visitor read the product's description, decided to purchase it online, and then completed all the steps required for making a purchase. Tracking visitor activity can be quite complex, so it is important to keep in mind that you will spend more time, effort, and resources as you strive for more clarity and accuracy in understanding who your visitors are.
customers. At a low level, you will want to know the definitions of several terms that are commonly used when discussing web activity.
Hit: Represents any individual item that is delivered from the server to the client. A single visitor action could result in dozens of hits. For example, when a web page is delivered to a client's screen, it may arrive with graphics, icons, flashing ads, sidebars with links, frames, and other items that all count as hits. While the volume of hits is an indicator of web server traffic, it is not an accurate reflection of how much real information is being looked at.
Important: Hit is one of the most misunderstood terms in web analytics. Please take time to understand this term rather than assume that you already know what it means.
Page View: A hit to any file classified as a page (such as html, htm, psp, and asp pages).
Note: For sites still using frames, an actual page viewed may consist of several HTML documents.
Visit: Denotes a sequence of a visitor's hits up until the point at which the gap between two successive hits is greater than the defined session timeout length (usually thirty minutes). Much marketing research focuses on statistics for visitor sessions to get a more accurate picture of user activity, because multiple requests can be made within a single visitor session. Visits are equal to sessions, which are explained in more detail in Sessionizing Your Visits on page 59.
Note: If you modify the session timeout length, you will get a different visit count. For example, shortening the timeout length will increase the number of visits.
The payoff in your analysis of the web activity is in finding the visitor.
Visitor: Represents the person or agent that generates the visits. Agent indicates a program, such as a robot or spider, that is used to visit web sites.
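To make the difference between hits and page views concrete, here is a small Python sketch that counts both for a single visitor action. The list of page extensions and the requested files are assumptions made for this example; a real analysis would use whatever file types your site classifies as pages.

```python
from pathlib import PurePosixPath

# Assumed set of file types that count as pages for this example.
PAGE_EXTENSIONS = {".html", ".htm", ".asp"}


def is_page_view(uri_stem):
    """Return True if the requested file is classified as a page."""
    return PurePosixPath(uri_stem).suffix.lower() in PAGE_EXTENSIONS


# Hypothetical requests generated when one visitor opens two pages.
requests = [
    "/index.html",
    "/images/logo.gif",
    "/images/banner.gif",
    "/styles/site.css",
    "/products/list.asp",
]

hits = len(requests)
page_views = sum(is_page_view(uri) for uri in requests)
print(f"{hits} hits, {page_views} page views")
```

Every request counts as a hit, but only the two page files count as page views, which is why hit totals overstate how much content visitors actually looked at.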
2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01 2002-01-01
00:40:46 00:41:22 00:44:00 00:44:17 00:46:13 00:48:24 00:59:59 01:01:13 01:03:02 01:04:40 01:06:32 01:09:01 01:09:18 01:10:51 01:11:30 01:12:22 01:14:48 01:17:06 00:29:59 01:19:52 03:19:59 03:21:02 03:23:29 03:25:34 03:33:55 03:39:59 03:43:08 03:59:59 04:00:00
24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 165.91.171.109 - W3SVC3 HERC 192.168.1.1 GET 24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET 66.67.2.10 - W3SVC3 HERC 192.168.1.1 POST 66.67.2.10 - W3SVC3 HERC 192.168.1.1 POST 206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET 38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET 206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET 38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET 38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET 24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET 12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET 38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET 12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET 192.11.223.116 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET 63.232.193.82 - W3SVC3 HERC 192.168.1.1 GET 24.140.30.88 - W3SVC3 HERC 192.168.1.1 GET 217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
If you look at the activity of 217.194.141.67 (remember that this is a visitor's IP address), you will notice that it has two sessions, which are separated by a gap of more than thirty minutes. Figure 4-1 shows the two sessions:
In general, sessionizing requires two basic elements:
- A time stamp, to determine the beginning and end of a visitor's session and to order hits in a time sequence
- A visitor identifier that ties each hit in the web data activity file to the web visitor responsible for the hit
The time stamp requirement is easily handled because web servers and data collection servers can add a time stamp to any hit recorded in a web data activity file. As long as Greenwich Mean Time (GMT) is used to indicate the time, servers that are located in different time zones will not have any problem understanding the time sequence of the data. The more complicated requirement is the visitor identifier.
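The two elements, a time stamp and a visitor identifier, are enough to sketch sessionizing in a few lines of Python. This example is only an illustration of the idea, not the WebTrends implementation. It uses the IP address as the visitor identifier, a thirty-minute timeout, and the hits for 217.194.141.67 and 66.67.2.10 from the sample records above.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Group (time stamp, visitor id) hits into sessions per visitor.

    A new session starts when the gap since the visitor's previous hit
    is greater than the timeout.
    """
    sessions = {}
    for stamp, visitor in sorted(hits):
        visits = sessions.setdefault(visitor, [])
        if visits and stamp - visits[-1][-1] <= timeout:
            visits[-1].append(stamp)  # continue the current session
        else:
            visits.append([stamp])    # start a new session
    return sessions


def hit(time_text, visitor):
    """Build one hit from an 'HH:MM:SS' time on 2002-01-01."""
    stamp = datetime.strptime("2002-01-01 " + time_text, "%Y-%m-%d %H:%M:%S")
    return stamp, visitor


HITS = [
    hit("00:41:22", "217.194.141.67"), hit("00:46:13", "66.67.2.10"),
    hit("00:48:24", "66.67.2.10"), hit("01:03:02", "217.194.141.67"),
    hit("01:04:40", "217.194.141.67"), hit("01:10:51", "217.194.141.67"),
    hit("01:14:48", "217.194.141.67"), hit("03:23:29", "217.194.141.67"),
    hit("03:39:59", "217.194.141.67"), hit("04:00:00", "217.194.141.67"),
]

sessions = sessionize(HITS)
for visitor, visits in sessions.items():
    print(visitor, [len(visit) for visit in visits])
```

Changing `SESSION_TIMEOUT` changes the visit count: with a shorter timeout, the same hits are split into more sessions.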
Visitor Identifiers
You have several different methods at your disposal for identifying the visitor associated with web site activity. These methods include:
- Client IP address or domain name
- Combination of IP address and agent information
- Cookie (persistent or session-only)
- Session IDs
- Data embedded in the URL
- Authenticated user
These methods are listed in order of increasing accuracy. The order also corresponds with the complexity of your site management. At the very minimum, you can examine the client's IP address. The next best thing is the combination of IP address and agent, but the very best method is authenticated users. In other words, the IP address is easy to identify, while authenticating users is much more difficult. Though each method has its strengths and weaknesses, you may encounter such issues as:
- The ambiguity of the visitor identifier: If two visitors can have the same identifier at the same time, they will appear as a single visit by the same visitor.
- The problem with aliasing of a visitor identifier within a single session: If a single visitor has more than one identifier (for example, an alias) within a session, that visitor will appear to be multiple visitors, each having its own visit session.
- The problem with the persistence of the identifier across multiple sessions: If a single actual visitor has two different identifiers from one session to the next, that visitor will appear to be two separate visitors. This causes an inaccurate count of unique visitors and of new versus returning visitors. It also doesn't allow you to accurately accumulate a single visitor's activity over the lifetime of that visitor.
As we discuss the various methods for identifying visitors, you will recognize how each method has one or more of these three issues to contend with.
Computer usage
Similar to the problems mentioned in the cookies section, when multiple users visit your site from the same machine, or when a single user visits your site from more than one computer, visitors cannot be accurately associated with web activity by using a computer's IP address.
Cookies
Probably one of the most commonly used and most accurate methods of tracking visitor sessions is through the use of a persistent cookie. A cookie refers to some text that a web server sends back to a client machine the first time that client machine visits a web site. This cookie text gets stored on the client machine's hard drive, and in subsequent requests to that web site by the client machine, the cookie is sent to the web server. Here's an example of a typical cookie text:
COOKIE_ID=10.21.151.222-92873123.102983222
Here's the process in three steps:
1. The client machine sends a request to the web server of a particular site for the first time. At this point, the client machine has no cookie information for that web site stored on its hard drive.
2. The web server processes that request and recognizes that the client request contains no cookie information. It then serves up the content requested by the client machine plus a cookie. Of course, for the cookie to function as a visitor ID, the cookie text delivered to the client machine must be unique. The web server also specifies a domain for which that cookie is valid. This way, the client machine knows which cookie to send for a given site, since client machines may have hundreds of cookies for a variety of web sites.
3. The cookie gets stored on the client machine's hard drive, and during subsequent visits to the web site, the client sends the cookie to the server in the request. The cookie is logged into the cookie field of the web server log, and may be used later to associate the visitor with all other logged hits containing that same ID in the cookie field.
The SmartSource Data Collector (SDC) has a cookie server component that delivers a cookie to a visitor if that visitor is new. Subsequent visits by that same visitor result in the cookie, which contains the visitor identifier, being sent to the SDC along with the web activity information. The cookie is generated by SDC and consists of the IP address sent in the original request appended to a decimal-separated number based on the time stamp of the request. Because the decimal-separated number uses the time stamp down to the nanosecond level, this combination results in a number that is almost guaranteed to be unique.
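The three steps can be sketched in Python with the standard `http.cookies` module. This is a simplified illustration under several assumptions, not the SDC's actual logic: the cookie name `COOKIE_ID` is taken from the example above, the domain `www.example.com` is a placeholder, and the ID format (IP address, a hyphen, then the time stamp in seconds with a fractional part) only imitates the general shape described in the text.

```python
import time
from http.cookies import SimpleCookie

COOKIE_NAME = "COOKIE_ID"          # name taken from the example cookie text
COOKIE_DOMAIN = "www.example.com"  # placeholder domain for this sketch


def handle_request(cookie_header, client_ip, now=None):
    """Return (visitor_id, set_cookie_value_or_None) for one request."""
    jar = SimpleCookie(cookie_header or "")
    if COOKIE_NAME in jar:
        # Returning visitor: reuse the ID the client sent back.
        return jar[COOKIE_NAME].value, None

    # New visitor: build an ID from the IP address and the time stamp.
    now = time.time() if now is None else now
    visitor_id = f"{client_ip}-{now:.6f}"

    response = SimpleCookie()
    response[COOKIE_NAME] = visitor_id
    response[COOKIE_NAME]["domain"] = COOKIE_DOMAIN
    response[COOKIE_NAME]["path"] = "/"
    return visitor_id, response[COOKIE_NAME].OutputString()


# First visit: no cookie is sent, so the server issues one.
first_id, set_cookie = handle_request(None, "10.21.151.222", now=1000.5)
print("Set-Cookie:", set_cookie)

# Later visit: the client sends the cookie back and is recognized.
second_id, nothing_to_set = handle_request(
    f"{COOKIE_NAME}={first_id}", "10.21.151.222"
)
print(second_id == first_id)
```

A persistent cookie would also carry an expiration date; that attribute is left out here to keep the sketch short.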
Some web sites attach a session ID to the user's activity, and this ID is recorded either directly in the cookie field or in the URL query parameters of the web data activity file. Similar to processing visitor IDs, WebTrends can cut the session ID out of the query parameters field and paste it into the cookie field, but session IDs, as the name implies, are only good for a given session. They do not persist across multiple sessions. In some cases, a session ID may have its own place in the record of a web data activity file and look like this:
SID=jhmbobkcb111inehlpkjhopabbe
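The idea of cutting a session ID out of the query parameters can be sketched as follows in Python. The parameter name `SID` comes from the example above; the function name and the other query parameters are made up for illustration, and this is not the mechanism WebTrends itself uses.

```python
import re

# Matches an 'SID=value' parameter at the start of the query or after '&'.
SID_PATTERN = re.compile(r"(?:^|&)SID=([^&]*)")


def extract_session_id(query):
    """Return (query without the SID parameter, session ID or None)."""
    match = SID_PATTERN.search(query)
    if match is None:
        return query, None
    remaining = (query[:match.start()] + query[match.end():]).lstrip("&")
    return remaining, match.group(1)


query = "page=2&SID=jhmbobkcb111inehlpkjhopabbe&lang=en"
remaining, session_id = extract_session_id(query)
print(remaining)   # the query parameters that stay in the URL field
print(session_id)  # the value that would go into the cookie field
```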
Authenticated username
Probably the most accurate way to identify visitors is by using the authenticated username that they enter into an authentication dialog to access restricted portions of a site. In this case, an authuser entry is made in the web data activity file, with the value being the username the visitor entered into the dialog. In the following example for the record of a web data activity file, John Smith is the name of the authenticated user.
2002-01-01 00:12:12 server2.att.com John_Smith W3SVC3 HERC 192.168.1.1 GET
This could be an extremely reliable method if a web site made its entire site password protected. However, there are many reasons that web sites tend to designate only portions of their site as password protected. Typically, these are areas of content that the visitor paid a subscription to access, as in the case of an online newspaper, or pages in which the user enters information that they wish to keep secure, such as credit card numbers, contact information, and other personal data. For the authenticated username method to work, the entire site would need to be password protected so that each visited page would result in the username being logged in the authuser field.
Here is another example of how authenticated usernames work: Consider the Yahoo sub-site, My Yahoo. To gain entrance to My Yahoo, you first had to register for the site. You probably entered your first name and last name, your address, your email address, your phone number, your zip code, and perhaps answered a survey with information about your background such as single versus married, income level, interests, occupation, and more. Yahoo takes the registration information that you entered and creates an external visitor database. Each time you log in to the site, you enter your username and password. That username shows up in the authuser field for any web data activity file hit made to an authenticated area of the site. The value in the authuser field is then used as a key to tie these hits to your visitor characteristics data in the external database. Therefore, anytime a visitor visits a site, no matter what computer that visitor logs in from, his or her username remains the same. By using authenticated usernames you can also eliminate the ambiguity that occurs when two or more visitors use the same machine to get to a site. Each user must enter their unique username and password. Figure 4-3 shows a sample report of authenticated usernames that visited most often.
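Using the authuser value as a key into an external visitor database can be pictured with a short Python sketch. Everything in it is hypothetical: the database contents, the page names, and the function name are invented for illustration, and a hyphen is assumed to mark hits with no authenticated user.

```python
from collections import defaultdict

# Hypothetical external visitor database, keyed by username.
VISITOR_DB = {
    "John_Smith": {"zip_code": "00000", "interests": ["boating"]},
}

# Hypothetical (authuser, requested page) pairs taken from log records.
HITS = [
    ("John_Smith", "/members/news.asp"),
    ("-", "/index.html"),
    ("John_Smith", "/members/account.asp"),
]


def activity_by_user(hits, visitor_db):
    """Group pages by authenticated user and attach each user's profile."""
    pages = defaultdict(list)
    for authuser, page in hits:
        if authuser != "-":  # skip hits with no authenticated user
            pages[authuser].append(page)
    return {
        user: {"profile": visitor_db.get(user), "pages": visited}
        for user, visited in pages.items()
    }


report = activity_by_user(HITS, VISITOR_DB)
print(report["John_Smith"])
```

The hit to `/index.html` is left out of the report because it carries no username, which shows why the method is only complete when the entire site requires authentication.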
Summary
To gain more meaningful insight into visitors' behavior on your web site, you need to be able to assign each hit in a web activity data file to the visitor responsible for that hit. You then need to be able to look at a specific visitor's activity and determine whether this activity occurred during one continuous visit session or over multiple visit sessions. The key to all this is how you associate a visitor with each web log record. There are several different identifiers that you may use to do this:
- Client IP address or domain name
- Combination of IP address and agent information
- Cookie (persistent or session-only)
- Session IDs
- Data embedded in the URL
- Authenticated user
A cookie, session ID, or authenticated username provides fairly accurate visitor identification, though you will likely have some background work to do in order to use these as identifiers. Your other main options are an IP address or a domain name. These two identifiers are readily available, but both are severely limited in how accurately they can identify visitors. Determining how your visitors behave on your web site is one of the most powerful aspects of web analytics. For this reason, you may want to invest the time that it takes to employ one of the more accurate means of identifying your web visitors.
Session Termination Time Frame: Click on Options > Session Tracking > New Session Tracking Definition.
Domain Name: Click on Options > Analysis > Domains.
IP Addresses, Cookies, and Authenticated Usernames: Click on Options > Session Tracking > Edit a session tracking definition.
Chapter 5
Defining Behaviors
After you understand how to collect activity data and what it looks like (both discussed in Chapter 3), and you understand the concepts involved in identifying your visitors (discussed in Chapter 4), you are ready to understand how to convert this raw activity data into something that matches the organization of your web site. WebTrends web analysis provides a set of pre-defined reports on a variety of visitor behaviors: the top pages visited, the top visitors, the top entry pages, and the top referrers. All of this is standard information available from data files, whether captured traditionally or via a client-side tag. Figure 5-1 on page 74 shows a sample Pages report.
To create basic measurement reports, you don't have to do much more than tell WebTrends where the web activity data is located. Basic reports can be useful indicators of general web site activity, but there's a lot more you can learn from WebTrends if you're willing to put in a little effort. The real benefits of WebTrends are found when you use it to identify and improve those areas of your site that are not working optimally or are reflecting traffic patterns far different from what you expected. For example, after viewing an advertisement, are people following its link to the specific page on your site that you intended for them? If not, you may want to reconsider the advertisement. Do people who begin to make a web-based purchase actually complete that purchase? If they abandon the purchasing process, then perhaps it's time for you to examine that process more closely.
So how can you determine whether your web site provides the functionality and gets the results that you intended? The answer is by understanding how your site is designed and then focusing your web site analysis on those functional site areas. Specifically, you need to tell WebTrends what the specific parts of your site were created to do.
URL classification
So how do you focus your analysis on just the web site content that matters to you (or to the person who asked you to report on this content)? The answer is actually straightforward: tell WebTrends which pages, groups of pages, and other web-based content you want to examine. In WebTrends terminology, this is referred to as URL classification. URL means Uniform Resource Locator. The URL is the address of a resource, or file, available on the Internet. The URL contains the name of the protocol required to access the resource (for example, http or ftp), a domain name that identifies a specific computer on the Internet, a directory and pathname on the computer, and, for dynamic web sites, sometimes query parameters. Figure 5-2 shows the URL format.
If the URL is the address of a static web page, then query parameters are not involved. Static pages send exactly the same response to every request. For example, a page on the Internet may be located at http://www.ietf.org/rfc/rfc2396.txt. This information describes a web page to be accessed with an HTTP (web browser) application that is located on a computer named www.ietf.org. The pathname for the specific file on that computer is /rfc/rfc2396.txt. If the URL is the address of a dynamic web page, then query parameters are involved. These parameters, not the page names, identify the page's content. A dynamic web page is simply a way to generate larger sites from a database, making it significantly easier to maintain pages as the site grows. For example, http://clothingshopping.com/category.aspx?catID=211 indicates a specific page at clothingshopping.com that sells children's clothing. In URL classification, you use a page's URL and perhaps also its URL query parameters to identify and then classify that page according to its function.
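The URL parts described above can be pulled apart with a few lines of Python. This is only an illustrative sketch using the standard library; the function name describe_url and the dictionary keys are assumptions made for this example and are not part of WebTrends.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url):
    """Split a URL into protocol, host, path, and query parameters."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,       # for example, http or ftp
        "host": parts.netloc,           # the computer that serves the resource
        "path": parts.path,             # directory and file name on that computer
        "query": parse_qs(parts.query), # empty for a static page
    }


# The static and dynamic examples from the text above.
static_page = describe_url("http://www.ietf.org/rfc/rfc2396.txt")
dynamic_page = describe_url("http://clothingshopping.com/category.aspx?catID=211")

print(static_page)
print(dynamic_page)
```

The static page has no query parameters, while the dynamic page carries its content identifier in catID.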
To learn which product is being selected, however, you need to examine the URL query parameters. In the example of the sunburst yellow cell phone cover, the URL, followed by the URL query parameters, would look something like:
www.zedesco.com/cart/order.asp?order_ID=10334&product=cellaccessories&type=cellcover&opt_type=color&opt=sunburst%20yellow
You could classify the page using only the URL stem (cart/order.asp) to collect all visits to the order page, regardless of what type of product was ordered. In this case the function of the pages would be to let web visitors order products. However, to get more information, you would use the URL query parameters to classify the page visit in more detail. In this case, you would classify the page as belonging to the group of cell phone accessories items ordered. WebTrends analysis products allow you to easily associate URL query parameters with an item or a group of items ordered.
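As a rough sketch of the two levels of classification just described, the following Python function first checks the URL stem and then reads the query parameters. The function name classify_order and the returned field names are hypothetical; WebTrends itself is configured through its own interface rather than through code like this.

```python
from urllib.parse import urlsplit, parse_qs


def classify_order(url):
    """Classify a hit on the order page by URL stem, then by query parameters."""
    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    if parts.path != "/cart/order.asp":
        return None  # not the order page
    params = parse_qs(parts.query)
    return {
        "function": "order",  # classification by URL stem alone
        "product_group": params.get("product", ["unknown"])[0],
        "item_type": params.get("type", ["unknown"])[0],
        "option": params.get("opt", [""])[0],
    }


order_url = (
    "www.zedesco.com/cart/order.asp?order_ID=10334&product=cellaccessories"
    "&type=cellcover&opt_type=color&opt=sunburst%20yellow"
)
result = classify_order(order_url)
print(result)
```

Classifying on the stem alone answers "was the order page visited?", while the query parameters answer "what was ordered?".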
Note: This book draws on examples from a hypothetical company called Zedesco Communications that sells electronics. Consequently, this book often refers to the Zedesco Communications web site, www.zedesco.com.
Content groups
Content groups designate pages with related subject matter. This grouping allows you to track visitor interest in subject matter rather than in individual pages, which makes interpreting visitor interest far more intuitive. By grouping together related pages, you can also track web activity on your site from perspectives that may not be inherently possible with your site's current organization. Let's look at two examples of content groups: one for a site with static web pages and another for a site with dynamic web pages.
and
news/domestic/article1.htm
news/domestic/article2.htm
news/domestic/article3.htm
These content groups specify that you gather visits to some pages in the international folder and visits to other pages in the domestic folder.
and
default.asp?div=news&type=domestic&article=1
default.asp?div=news&type=domestic&article=2
default.asp?div=news&type=domestic&article=3
In this case, you would track the page default.asp that had the parameter div with a value of news and the parameter type with a value of domestic or international. With web server logs, you have to tell WebTrends which pages belong in each content group. As WebTrends parses the records, it looks for entries that belong to a given content group. By contrast, when using a data collection server, content group information is accumulated as the pages are served. This is because when pages are created, if they belong in a specific content group, you can include the name of the content group in the page's META tag information. The SmartSource Data Collector knows to look for this information, and then sends it on to WebTrends for reports or to a web data warehouse. By using SmartSource Data Collector, you only have to configure a page one time to associate it with a content group. Of course, even if you are using SmartSource Data Collector, the WebTrends engine can still be configured to recognize content groups from the raw URLs. Figure 5-3 shows a sample Content Groups report. This report identifies the most popular groups of web site pages and how often they were visited.
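The matching of raw URLs to content groups can be pictured with a short Python sketch. The group names "Domestic News" and "International News" and the function content_group are assumptions made for illustration; the folder and parameter names come from the examples above.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical group names keyed by the folder or type value used in the examples.
GROUP_NAMES = {"domestic": "Domestic News", "international": "International News"}


def content_group(url):
    """Return the content group for a static or dynamic news URL, or None."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    params = parse_qs(parts.query)

    # Static site: the folder identifies the group.
    for folder, name in GROUP_NAMES.items():
        if path.startswith("news/" + folder + "/"):
            return name

    # Dynamic site: the div and type parameters identify the group.
    if path == "default.asp" and params.get("div") == ["news"]:
        return GROUP_NAMES.get(params.get("type", [""])[0])

    return None


print(content_group("news/domestic/article2.htm"))
print(content_group("default.asp?div=news&type=domestic&article=3"))
```

Both calls print the same group, which is the point of a content group: the report follows the subject matter, not the way the site happens to store it.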
Product groups
Product groups are a specialized type of content group that help you to track pages specifically related to products you sell or promote on your site. WebTrends analysis products track product groups separately because products are such a high profile component of most sites.
Keep in mind that some of these pages represent cordless phones, others represent cell phones, while still others are cell phone accessories (in the accessories directory).
Note: A large, database-driven site that uses dynamic URLs would use the following structure:
products/info.asp?prod=1783&cat=13
where
13 represents cordless phones
1783 identifies SBC-2905
If you wanted to capture cell phones and cell phone accessories in a product group, you would capture the following, assuming that the travel chargers, car-kits, headsets, and the video games are cell phone accessories:
products/phones/cell phones/XT2100.htm
products/phones/cell phones/SCH-N300.htm
However, note that headsets could overlap into a cordless phone accessories product group. It is common for pages to have several places where they might be logically grouped. To capture cell phones and their accessories, you would tell WebTrends to take all content in the \products\phones\cell phones directory and group it with the individual pages for the remaining items. In this case, that would mean you would tell WebTrends to group visits to the cell phones directory pages with visits to the following accessory pages: travelcharger.htm, covers.htm, headset.htm, and videogame.htm. Figure 5-4 shows a sample Product report. It represents the number of visits during which product-related pages were viewed.
Scenario analysis
In the context of defining your site's structure for WebTrends, you need to know which areas of your site, if any, contain sequences of pages that make up a web-based task you want your visitors to complete. These sequences of pages are called scenarios, and some of the most common examples of scenarios are registering as a user of a web site, making an online purchase, or filling out a survey. For example, Zedesco has a registration process that requires web visitors to fill out the following pages to complete their registration:
Start of information request
Verified information
Completed registration
These steps constitute a registration scenario. Another common scenario is the shopping cart scenario, in which your visitors proceed through a series of steps to purchase products. Other, less familiar sequences on your site may also be important to track: for example, a sequence of product pages that you want to make sure visitors are viewing, or, if you are a travel web site, a set of pages that your visitors must complete to track prices for their top flight itineraries. Figure 5-5 shows a Registration Conversion Funnel report. This analysis offers insight into each step along the information request process. Each step shows a drop-off as visitors move through the funnel.
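The drop-off in a conversion funnel can be illustrated with a small Python sketch that counts how many visits reached each step in order. The page names below are hypothetical stand-ins for the three registration steps, and the counting rule (a step counts only after the previous step was seen in the same visit) is an assumption for this example, not a description of how WebTrends computes its report.

```python
def funnel_counts(steps, visits):
    """Count the visits that reached each scenario step, in sequence."""
    counts = [0] * len(steps)
    for pages in visits:
        position = 0  # index of the next step this visit still has to reach
        for page in pages:
            if position < len(steps) and page == steps[position]:
                counts[position] += 1
                position += 1
    return counts


# Hypothetical pages for: start of request, verified information, completed.
registration_steps = ["register/start.asp", "register/verify.asp", "register/done.asp"]

sample_visits = [
    ["default.htm", "register/start.asp", "register/verify.asp", "register/done.asp"],
    ["register/start.asp", "register/verify.asp"],
    ["register/start.asp"],
    ["default.htm", "products/phones.htm"],
]

counts = funnel_counts(registration_steps, sample_visits)
for step, count in zip(registration_steps, counts):
    print(step, count)
```

With this sample data, three visits start the registration, two verify their information, and one completes it.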
Advertising views
If your company hosts advertisements on its site, it can be very important to show your customers how much traffic the ad you're hosting for them generates. In addition, the development of pricing schedules may be heavily dependent on where the ad is placed. You may need to provide numbers to potential customers that show how valuable a particular piece of web real estate is for advertising. Reports on traffic generated by ads placed in various areas of the site can let your customers balance level of exposure versus cost when making their decision about posting their ad. Advertisements can be broken into two parts:
Ad View: Visitor views a page containing the ad graphic or link.
Ad Click: Visitor actually clicks on the ad and opens its content.
Depending on the ad hosting method, both the ad itself and the content it links to may be hosted on your site. However, it is also common to host the ad on your site, yet have the content of that ad hosted by your customer, on their site. In the first method, the Ad View and the Ad Click that results in the ad content display are both logged to your web server log because all activity occurs on your web site. In the second method, the Ad View activity is logged to your web server log, but the display of the ad content is logged to your customer's web server log, not yours. You can get around this issue by implementing server-side scripting (for example, CGI, Perl, or ASP) to perform a redirect to the destination URL. A very common Perl script is redir.pl. This redirect command sends the hit information back to your web server's data activity file, and is recognized as an indicator that the ad was opened. Of course, if you are using a data collection server or client-side tagging method, you can easily collect this information by running a script each time an Ad View or Ad Click occurs.
An ad click is an indicator of greater interest in the ad than an ad view is because it implies that the user focused directly on the ad and was interested enough to click on it. Figure 5-6 on page 86 shows an Onsite Ad Impressions report that shows how often specific ads were viewed.
In the Onsite Ad Impressions report note that the Ad Views Visits column refers to the number of visits by visitors who saw the specified ad. A visit is a series of actions that begins when a visitor views a first page from the server and ends when the visitor leaves the site or remains idle beyond the idle-time limit. The default idle-time limit is thirty minutes. This time limit can be changed by the system administrator. Therefore, a visitor may see an ad more than once during a visit, but the ad will only be counted once in this table and graph.
Parameters to those pages control their actual content, and so it is those parameters that need to be included along with the page name. For example, using the dynamic URL:
default.asp?type=domestic&div=news&article=104&sessionid=155428642
You may find it most informative to know which division and type of articles are being viewed. It makes sense to include those parameters in the page's URL for reporting. Including the sessionid is, however, not at all desirable, since it makes every page access appear to be different content. WebTrends allows you to rebuild the URL, specifying just which parameters to use. In the example above, you may want to include the div and type parameters only. This could be used to transform the URL above into:
default.asp?div=news&type=domestic
Using the URL rebuilding feature, the Pages report becomes more enlightening.
Page
default.asp?div=news&type=domestic
default.asp?div=financial&type=domestic
default.asp?div=ads&type=domestic
default.asp?div=news&type=international
catalog.php?dept=clothing
catalog.php?dept=hardware
catalog.php?dept=kitchen
catalog.php?dept=advice&type=domestic
catalog.php?dept=food
Note that the parameters are sorted alphabetically. This ensures that two URLs that differ only in the order of their parameters are still considered to refer to the same content.
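The idea of rebuilding a URL from selected parameters, sorted alphabetically, can be sketched in Python as follows. The function rebuild_url is a hypothetical helper written for this illustration; it is not the WebTrends URL rebuilding feature itself.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def rebuild_url(url, keep):
    """Keep only the named query parameters and sort them alphabetically."""
    parts = urlsplit(url)
    pairs = [(name, value) for name, value in parse_qsl(parts.query) if name in keep]
    pairs.sort()  # alphabetical order makes parameter order irrelevant
    query = urlencode(pairs)
    return parts.path + ("?" + query if query else "")


raw = "default.asp?type=domestic&div=news&article=104&sessionid=155428642"
reordered = "default.asp?sessionid=998877&div=news&type=domestic&article=7"

print(rebuild_url(raw, {"div", "type"}))
print(rebuild_url(reordered, {"div", "type"}))
```

Both URLs collapse to the same reporting entry, even though their session IDs, article numbers, and parameter order differ.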
File types
As web site development and publishing have become more involved, so have the types of content that can be hosted by your site. In addition to standard HTML documents, sites also host downloadable files for Flash presentations, Microsoft Word documents, Adobe Acrobat .pdf files, compressed files, video files, audio files, executables, and so forth. You need to tell WebTrends how you want it to view various file types based on their file extensions. While at first it may seem obvious which file types are documents and which are downloadable files, consider how you might classify the following Adobe .pdf file.
/club/kb/Nokia C23/owners_manual.pdf
Is it a downloadable file or a document? Really, it depends on how you expect visitors to use it. For ambiguous cases such as these, you must tell WebTrends how to treat each file extension. That way, when your analysis software parses the web data activity file and encounters a record such as the request for the Nokia C23 Owner's Manual, it knows what to do.
You will want to divide files this way so that you can determine whether or not your visitors look at certain types of files. If you devote a substantial portion of the budget to creating multimedia pieces for your site, you want to know that your investment is paying off. You may also have the same information presented in multiple formats and want to know which format your visitors use the most: static documents or interactive elements.
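A mapping from file extension to file type is the kind of decision described above. The sketch below is only an illustration: the category names and the choice to treat .pdf as a document are assumptions for this example, and you would make the choice that matches how your visitors use each file type.

```python
import posixpath

# Hypothetical rules; treating .pdf as a document is a choice, not a given.
FILE_TYPE_RULES = {
    ".htm": "document",
    ".html": "document",
    ".asp": "document",
    ".pdf": "document",
    ".zip": "downloadable file",
    ".exe": "downloadable file",
    ".gif": "image",
    ".jpg": "image",
}


def classify_file(path, rules=FILE_TYPE_RULES):
    """Classify a requested file by its extension (case-insensitive)."""
    extension = posixpath.splitext(path)[1].lower()
    return rules.get(extension, "unclassified")


print(classify_file("/club/kb/Nokia C23/owners_manual.pdf"))
```

Changing a single entry in the rules table is all it takes to report the owner's manual as a downloadable file instead.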
Summary
Many people who have WebTrends never realize the full potential that lies in the features it provides. Instead, they only venture as far as using the standard reports that ship with WebTrends and track information about the entire site, not specific pages or areas of the site. The real value in web analytics is in identifying and examining specific areas of your site in detail. Typically, these areas are ones that allow web visitors to complete an action, such as making a purchase, researching a product, or solving an issue by reviewing online support materials. The tools provided with WebTrends allow you to track visitor behavior: visits to content and product groups, the steps in a scenario, clicks on advertisements, and the paths that visitors took through your site. All of these tools can help you focus on your site to find what is working and what needs some improvement.
Checklist (columns: Consideration, Yes, No, Comments)
Your web site is organized so that it can be searched according to content groups?
You need to know the visits and hits for each content group?
Your web site is organized so that it can be searched according to product groups?
You need to know the visits and hits for each product?
Your web site has scenarios that you would like to analyze (for example, shopping cart)?
You need to know the number of Ad Views and Ad Clicks on your ads?
Your home page name changes now and then?
Chapter 6
After understanding global and local filters, you can consider two types of filters that allow you to specify which data to analyze: include filters and exclude filters. Include filters specify the data to use in the analysis. Exclude filters specify what not to include in the analysis. Sometimes it doesn't matter which filter you use, but at other times, one kind of filter is distinctly more convenient to use than the other. You can apply the concepts of including versus excluding data at two different levels of filtering: filtering on hits and filtering on visits. The remainder of this chapter describes how include and exclude filters work with hit filters and visit filters. By understanding the concepts involved, you can limit your analysis to the data that pertains to your needs. If you choose to apply no filters to your web activity files, the analysis software analyzes all the data. However, this may impact performance and analysis time, because your data records will contain information about images and other kinds of data that have no real value for analysis.
Hits
When your web server or data collection server records visitor activity, each line in the record represents a hit to the server. Hits are the individual activities that combine to make up a visit to a single page. Think of the contents of a typical web page. Most consist of some text and one or more graphics. When users request a page, they are actually making requests for each item on the page: maybe a GIF image of a company logo, some HTML text, and a JPEG image. The server either successfully or unsuccessfully handles each item, and then logs the results of the request for that item, or hit, along with other information about the hit. With web server data files, one record in the web activity data file equals one hit. With client-side tagging, however, WebTrends SmartSource Data Collector server data files do not typically record hits to graphics images, so a SmartSource Data Collector server log typically contains only page hits.
Visits
A visit, or a visitor session, includes all the pages a unique visitor requests during a period of continuous activity on your site. Consequently, it includes all the hits associated with those pages in the visit. Visits are considered closed after the visitor remains inactive for a specified period of time. As a general rule, a visitor session should be closed if the user remains inactive for 30 minutes, although your WebTrends administrator may wish to specify a timeout period that is more in keeping with your web site analysis requirements.
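The idle-time rule for closing a visit can be sketched in a few lines of Python. The function split_into_visits is a hypothetical illustration for one visitor's hits; treating a gap of more than the limit as the start of a new visit is an assumption about the boundary case, and the 30-minute default mirrors the general rule stated above.

```python
from datetime import datetime, timedelta


def split_into_visits(hit_times, idle_limit=timedelta(minutes=30)):
    """Group one visitor's hit times into visits separated by idle gaps."""
    visits = []
    for moment in sorted(hit_times):
        if visits and moment - visits[-1][-1] <= idle_limit:
            visits[-1].append(moment)   # still within the current visit
        else:
            visits.append([moment])     # idle too long: a new visit begins
    return visits


hits = [
    datetime(2001, 3, 4, 0, 0, 0),
    datetime(2001, 3, 4, 0, 10, 0),
    datetime(2001, 3, 4, 0, 45, 0),  # 35 minutes after the previous hit
    datetime(2001, 3, 4, 0, 50, 0),
]

visits = split_into_visits(hits)
print(len(visits))
```

Passing a different idle_limit shows how the administrator's timeout setting changes the number of visits reported for the same hits.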
Requested URL
You may decide that you need to include or exclude certain pages from analysis so that you can focus more directly on specific areas of the site. For example, if you are part of an IT organization, you may wish to determine whether your web visitors are viewing your knowledge base articles, all of which have a prefix of kb_. You could either list all of the knowledge base articles you wish to track, or, since WebTrends supports wildcard usage, you could specify that your filter includes all files beginning with kb_. If your site uses a content management system, then instead of specifying pages to include or exclude, you may need to specify a page and any URL query parameters that grabbed the content displayed in that page. An example of knowledge base articles that you may wish to track web activity for could be for issues with the P100 cellphone. The excerpt below is a hypothetical web data activity file entry that shows how this could appear:
2001-03-04 00:25:51 proxy1.thegrid.com - W3SVC3 web1 192.168.1.1 GET /support/default.asp product=p100&id=kb_5
The query parameters are product and id, where product=P100, and id=kb_5. You could track activity for P100 articles by specifying that your analysis include all hits with the page, default.asp, the product query parameter having a value of P100, and any records with an id value that contains the prefix kb_.
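An include filter of this kind amounts to a test applied to each record. The Python sketch below assumes the URL stem and query string have already been extracted from the log record; the function name and the case-insensitive comparison of the product value are assumptions for this illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qs


def matches_kb_filter(stem, query):
    """Return True for hits on P100 knowledge base articles (id begins with kb_)."""
    params = parse_qs(query)
    product = params.get("product", [""])[0]
    article = params.get("id", [""])[0]
    return (
        stem.endswith("/default.asp")
        and product.lower() == "p100"       # assumed: product value is not case-sensitive
        and fnmatchcase(article, "kb_*")    # wildcard match on the kb_ prefix
    )


print(matches_kb_filter("/support/default.asp", "product=p100&id=kb_5"))
print(matches_kb_filter("/support/default.asp", "product=p100&id=faq_2"))
```

The wildcard pattern kb_* plays the same role as listing every knowledge base article individually, without the maintenance burden.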
HTTP method
Your web server log may show requests using several different HTTP methods, but most frequently, you will encounter GET requests. These requests, when logged, contain more useful information for analysis purposes than any other method. A GET request returns whatever information is identified by the request URL and associated query parameters. For example, if you are using the Internet, and you click on an image, the actual request for that image might look like this:
GET /picture.jpg HTTP/1.1
In a distant second place is the POST method, which some web sites use to submit forms. A couple of other rarely used methods are PUT and HEAD. These methods seldom contain useful information for web analysis, and because they are used infrequently, they may never appear in your web data activity file. Typically, your web traffic analysis will process GET requests, though if your site has forms that use the POST method, you may wish to track activity on those forms. WebTrends has the capability to exclude records of requests using methods you don't want to track. Of course, you could also choose to include only those methods you do want to track, and the results would be the same.
Cookie
As mentioned in Chapter 4 (see Cookies on page 64), cookies can be a means by which WebTrends can recognize visitors. However, cookies are also used to store various types of information, such as shopping cart contents, time of first visit, and number of visits. By selecting an appropriate cookie, you can investigate the behavior of a specific segment of your visitors. The cookie filter is typically used for this kind of investigation. It can also be useful if you know of visitors whose activity is not pertinent to your analysis and you wish to exclude their activity.
Multi-homed domain
If your site is spread across multiple domains on the Internet, you may want to view the activity of only one domain. You may also wish to exclude the activity of one or more domains. A multi-homed domain filter lets you specify which domain or domains to filter from the analysis. Let's say that your company is based in the US, but its site has sub-sites in the US (www.yourcompany.com), some in France (www.yourcompany.fr), and some in Germany (www.yourcompany.de). If you only wished to view the main US site, you could either exclude the French and German sites or, perhaps more easily, include only data from the US site in the analysis. For users of SmartSource Data Collection, the multi-homed domain filter can also be used to filter out hits from sites that may have copied pages along with the SDC script included in those pages (recall the discussion of client-side tagging; see Using client-side tagging on page 49). Another use of the multi-homed domain filter, this time as an include filter, is to identify sites that have stolen copyrighted material.
Browser
With all the different types of browsers available today, you may want to get a sense of the types of activity carried out from various flavors of browsers: Internet Explorer, Netscape Navigator, and WAP and Palm device browsers. You may even want to know if activity originated from a robot or spider crawling your site. Your web data activity files typically contain a reference to the browser used to access content. The files also record visits from spiders and robots in the same browser and browser version field. If your business has a portion of its site devoted to WAP devices such as cellular phones, and you wish to examine visitor activity on only those WAP-specific areas, you could tell WebTrends to only analyze requests originating from WAP browsers. The excerpt below shows a possible web data activity file entry that would be included in analysis if you created an include filter for WAP device browsers.
2001-03-04 08:39:02 208.18.146.75 - SERVER10 WEB1 - GET /wml/products/wireless/phones.wml - 200 0 647 543 0 80 HTTP/1.1 UP.Browser/3.1.03-NK02+UP.Link/4.2.1.7 WEBTRENDS_ID=133.205.252.8-2562687908.34229567 -
A portion of this excerpt refers to the browser and browser version number used by the client making the request:
UP.Browser/3.1.03-NK02+UP.Link/4.2.1.7
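An include filter for WAP device browsers boils down to testing the browser field of each record. The following Python sketch is a simplification: the marker list contains only the UP.Browser string from the excerpt above, and the function name is hypothetical. A real deployment would need a fuller list of WAP browser identifiers.

```python
def is_wap_request(user_agent, markers=("UP.Browser",)):
    """Return True when the browser field contains a known WAP browser marker."""
    return any(marker in user_agent for marker in markers)


browser_fields = [
    "UP.Browser/3.1.03-NK02+UP.Link/4.2.1.7",
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)",
]

# An include filter keeps only the matching records.
wap_only = [field for field in browser_fields if is_wap_request(field)]
print(wap_only)
```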
You may also wish to compare the types of activity you experience from a specific standard HTML browser such as Netscape or Internet Explorer. Because these browsers handle HTML code slightly differently, comparing the visitor experience on one browser with another can reveal valuable information. For such a comparison, you could create an include filter for each browser of interest and then review analysis results for each browser. For example, if you find that Netscape Navigator users drop out more frequently in a shopping cart scenario than do Internet Explorer users, this may indicate that the HTML code does not appear as you had intended on browsers using Netscape Navigator. Although web designers always try to review their sites in several different versions, it's easy to miss problems with design when you have numerous pages to review or if testing is not thorough.
Return Codes
Return codes indicate whether or not requested content was successfully delivered, and if not, what the problem may have been. Return codes in the 200s and 300s indicate a successful content delivery, while those in the 400s and 500s indicate a failed delivery. For most web visitors, the most well-known and irritating error is the standard 404 File Not Found error. In the web activity data file, this appears as a server-to-client status entry. The following data file entries show a successful return code of 304 (Success: Not Modified) in the first entry, and a successful return code of 200 (Success: OK) in the second entry. Both return codes are highlighted in bold print:
2001-03-04 00:03:23 computer.attcanada.ca - W3SVC3 web1 192.168.1.1 GET /club/kb/s32/motors.wmp - 304 0 27000 58 412 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) WEBTRENDS_ID=10.14.211.5292873123.102983222
2001-03-04 00:04:09 computer.quest.com - W3SVC3 web1 192.168.1.1 GET /dealers/default.asp WT.sv=Web%20Server%201&WT.ti=Dealer%20Home&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768&WT.jo=Yes&WT.js=Yes&WT.co=Yes 200 0 37211 121 389 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.0b1;+Windows+NT)
Because 400 and 500-level errors indicate potential problems with your site, you may choose to create an include filter that analyzes only the activity on failed requests. You can then determine which pages may have problems that are preventing users from accessing your content and modify those pages to resolve the problem.
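An include filter on failed requests, followed by a count per page, can be sketched as below. The records are assumed to be already reduced to a (page, return code) pair, and the page /support/old_faq.htm is a made-up example of a broken link; neither is taken from a real log.

```python
from collections import Counter


def failed_pages(records):
    """Count failed requests (400- and 500-level return codes) per page."""
    return Counter(page for page, status in records if 400 <= status <= 599)


records = [
    ("/club/kb/s32/motors.wmp", 304),
    ("/dealers/default.asp", 200),
    ("/support/old_faq.htm", 404),   # hypothetical missing page
    ("/support/old_faq.htm", 404),
    ("/cart/order.asp", 500),        # hypothetical server error
]

failures = failed_pages(records)
for page, count in failures.most_common():
    print(page, count)
```

The pages at the top of this list are the first candidates for repair.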
IP Address
What if your company just launched its web site after a major site redesign? Your company had a big launch party, and afterwards all the employees decided to look at the redesign on their own. You probably wouldn't want to include their visits in your analysis, so you could simply filter them out based on their IP addresses or your company's domain name. Within each web data activity file entry is a field that indicates the computer address of the visitor. Depending on whether or not you instructed WebTrends to resolve IP addresses, this may either be an IP address or a domain name. Filtering on a visitor's IP address or domain name allows you to include or exclude specific addresses in your analysis. You might also want to see levels of activity based on regions, country, or domain types. The web data activity file entry below with the bold highlighted entry shows a visit from a computer located in Canada, as evidenced by the .ca extension:
2001-03-04 00:03:23 computer.attcanada.ca - W3SVC3 web1 192.168.1.1 GET /club/kb/s32/motors.wmp - 304 0 27000 58 412 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) WEBTRENDS_ID=10.14.211.5292873123.102983222
If your web site caters to educational institutions, then you would be most interested in activity originating from educational organizations. You could capture this data for analysis by creating a filter that included all educational sites based on their domain type extension of .edu. Another use of the IP filter is to filter out monitoring software, such as Keynote, which is used to maintain the health of the web site. That is, companies and organizations with extensive web sites find it beneficial to have their web site monitored by special monitoring software. Every time the monitoring probes a given web site, all of its activity will be counted unless the IP filter has been used to filter out the monitoring software.
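Because the visitor address field may hold either an IP address or a resolved domain name, a filter has to handle both. The Python sketch below shows one way to do that; the network range 192.168.1.0/24 and the suffix .zedesco.com are assumed values standing in for your own company's addresses or a monitoring service's addresses.

```python
from ipaddress import ip_address, ip_network


def address_matches(client, networks, domain_suffixes):
    """Match a visitor address field that is either an IP address or a domain name."""
    try:
        address = ip_address(client)
    except ValueError:
        # Not an IP address, so treat the field as a resolved domain name.
        return any(client.lower().endswith(suffix) for suffix in domain_suffixes)
    return any(address in ip_network(network) for network in networks)


internal_networks = ["192.168.1.0/24"]   # assumed internal address range
internal_domains = [".zedesco.com"]      # assumed company domain suffix

clients = ["192.168.1.77", "pc12.zedesco.com", "computer.attcanada.ca", "208.18.146.75"]

# An exclude filter drops the matching addresses and keeps everything else.
external = [c for c in clients if not address_matches(c, internal_networks, internal_domains)]
print(external)
```

The same function works as an include filter: calling it with an empty network list and a suffix of .edu selects only visitors from educational domains.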
File
Many hits contain requests for images that have very little meaning for you. These hits overload your system with meaningless data to analyze, and you are likely more interested in the actual pages that were opened during a visit than in the images your visitors saw. You can use a file filter to select the file types, such as GIFs, JPEGs, and other image or graphics files, that you wish to include in or exclude from analysis. Figure 6-1 shows a report that identifies the accessed types of files for your site and the total number of kilobytes of data transferred for each file type. The percentage column (%) reflects the percentage of all kilobytes of data transferred for the specified file type.
Directory
If your site is structured in such a way that various directories include specific types of content (the products directory contains products content, the support directory contains all technical support content, and so on), it may be helpful to look at various areas of your site by including or excluding content based on the directory and sub-directories in which that content resides. Tell WebTrends to include the directories that contain content of interest to you, or conversely, to exclude the directories that contain content you do not wish to analyze.
To report on the specials ad, you would filter in only the ads/specials.gif file. To report on the ad clicks, you would filter in only the redirect/yahoo1.htm file and the return code 302.
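The two filters just described can be pictured as simple tests on each record. In the Python sketch below, records are assumed to be already reduced to a (file, return code) pair; the file names come from the text above, while the function name and sample data are made up for illustration.

```python
def ad_counts(records):
    """Count ad views (hits on the ad graphic) and ad clicks (302 redirects)."""
    views = sum(1 for name, status in records if name == "ads/specials.gif")
    clicks = sum(
        1 for name, status in records
        if name == "redirect/yahoo1.htm" and status == 302
    )
    return views, clicks


sample_records = [
    ("ads/specials.gif", 200),
    ("default.htm", 200),
    ("ads/specials.gif", 200),
    ("redirect/yahoo1.htm", 302),
    ("ads/specials.gif", 304),
    ("redirect/yahoo1.htm", 404),  # a failed request is not counted as a click
]

views, clicks = ad_counts(sample_records)
print(views, clicks)
```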
Authenticated username
If your site requires that your users complete an authentication process, you can include or exclude hits from specific visitors based on their user names. This concept is very similar to that mentioned in the cookies section earlier, though cookies can be used to filter more than just specific users. You might use an authenticated username filter if you found that a particular user whom you do not trust is snooping around on your site. You could just as easily use an authenticated username filter to discover whether a particular prospect is exploring your site. Refer to Authenticated Usernames report on page 69 to get an idea of this kind of report.
Entry page
The page on which the visitor first enters your site is the entry page. Filtering by entry page lets you include or exclude from analysis visits that started on specific pages. For example, if you have a redirect page you're using to track an ad, you might choose to only include activity associated with a visitor session that began with a visit through the ad's redirect page. You might also want to view only activity of visitors who began their visitor sessions somewhere in the middle of your site, because these visitors often have more of a purpose in their visit than do visitors who enter at your home page. To do this, you would create an exclude filter that filters out all visits that began on your home page. Figure 6-3 shows a report that identifies the first page viewed when a visitor visits your site, the number of visits to those pages, and the percentage of times this page was the entry page compared with other entry pages. The most common entry page is usually the home page, but other common entry pages include specific URLs that visitors type, pages that have been bookmarked, or pages referred to by other sites.
server log is the page on your site that they were last viewing before beginning the new visitor session.
Advertising campaigns
If you have advertising campaigns on your site you may wish to track the activity that is occurring on them. To do this you must create a campaign definition before you can filter on a campaign. This definition specifies the referring page or entry page that, when visited, represents a visit to the campaign, or, in the case of SmartSource Data Collector, query parameters. The most common use for filtering by campaigns is to include only the visitor session activity associated with a particular campaign. If you have a reasonable idea of the value that you can associate with specific activity, you may be able to forecast the revenue that can be generated by the campaign. Figure 6-4 shows a report that provides visitor activity for each campaign.
Filtering on visits is slightly more restrictive than filtering on hits. With hit filtering, you apply the filter directly to the raw web data activity file. Any hit that matches the criteria is either included or excluded from analysis depending on the type of filter you specified. When you filter on visits, however, the web data activity file has already been parsed and processed to sessionize your data. At this point, you are not actually applying the filter to the raw web data activity file; you are applying it to a summary of the hits associated with a visitor session.
Regarding logging, keep in mind that because your browser wants to give you information as quickly as possible, it uses a process (called multi-threading) that allows multiple items to be downloaded at the same time. As these items either successfully or unsuccessfully load in your browser, they get logged to the web server or data collection server web data activity file. This means that if an image loads in your browser before the HTML text file that references it, your web data activity file will record the hit for that image first. WebTrends then needs to reorder hits into the proper time sequence so that the visit sessions are accurate.
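The reordering and sessionizing step can be pictured with a short sketch. It assumes hits are `(visitor, timestamp, url)` tuples and that a visit ends after 30 minutes of inactivity; both the record layout and the idle limit are assumptions for illustration, not a description of how WebTrends does it internally.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(minutes=30)  # assumed idle-time limit

def sessionize(hits):
    """Group (visitor, timestamp, url) hits into visits.

    Hits are first sorted by visitor and time, because the file may record
    them out of order. A new visit starts when the gap since the visitor's
    previous hit exceeds IDLE_LIMIT.
    """
    visits = []
    last_seen = {}  # visitor -> (timestamp of last hit, current visit list)
    for visitor, ts, url in sorted(hits, key=lambda h: (h[0], h[1])):
        prev = last_seen.get(visitor)
        if prev is None or ts - prev[0] > IDLE_LIMIT:
            current = []
            visits.append((visitor, current))
        else:
            current = prev[1]
        current.append(url)
        last_seen[visitor] = (ts, current)
    return visits

t = datetime(2004, 1, 1, 12, 0, 0)
hits = [
    ("a", t + timedelta(seconds=2), "/logo.gif"),   # appears first in the file
    ("a", t + timedelta(seconds=1), "/index.htm"),
    ("a", t + timedelta(hours=2), "/products.htm"),
]
visits = sessionize(hits)
```

A visit filter would then be applied to the entries in `visits`, not to the individual hits.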
Data aggregation
Once your web activity data finds its way to either a web server or data collection server log, that data is processed and stored in an aggregated format in a set of summary tables. In addition to the log data in summary tables, you can also add data from external sources (for example, demographic data or customer data). In creating these summary tables, you have summarized the data in defined ways, and then you have discarded the raw data from which you aggregated the summary data. Summary tables can be used for data visualization with reporting tools such as WebTrends analysis products, Key Business Indicator tables, or external data reporting and visualization tools such as Crystal Reports. In addition, you can use these tables to perform deeper analysis with data mining tools, or you can run online analytical processing (OLAP) to uncover trends in your data that you might never think to consider.
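To illustrate what aggregating into a summary table means, the sketch below rolls hypothetical sessionized visits up by entry page. The record layout and the choice of measures are assumptions, not the actual WebTrends table schema.

```python
from collections import Counter

# Hypothetical sessionized visits: (entry_page, pages_viewed).
visits = [
    ("/index.htm", 5),
    ("/products.htm", 2),
    ("/index.htm", 3),
]

visit_counts = Counter()
page_views = Counter()
for entry, pages in visits:
    visit_counts[entry] += 1
    page_views[entry] += pages

# The summary table keeps only the aggregates; the raw rows can be discarded.
summary_table = {
    entry: {"visits": visit_counts[entry], "page_views": page_views[entry]}
    for entry in visit_counts
}
```

Once the raw rows are discarded, only questions that the chosen aggregates can answer remain available, which is why the summaries must be defined up front.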
Table filtering
As the data comes through the analysis process, it is put into various tables. Table-level filtering allows you to choose which data to include in a particular table, whereas global filtering affects the data that all of the tables are exposed to. With table filtering, you are selecting a portion of the entire analysis data set to be included in a single custom table. Default custom tables can include different subsets of the analysis data. As with global filtering, table filtering can be based on hit or visit properties. Figure 6-6 shows an overview of the table filtering process.
Custom Reports
WebTrends analysis products ship with a number of pre-defined reports that cover the information most organizations want, but every organization has its own unique requirements for the web activity information it needs to see. This is where custom reports are particularly useful. Custom reports allow you to set one or two table dimensions. For example, you might want information about new visitors from a specific geographical region or with a certain income level. With custom reports, any dimension for which you have data, including any external data source you may have tied to your web activity data, can be tied to measures such as the number of page views, the number of visits, or the duration of a visit. If you need to narrow down what you view in the reports, you can apply filters to the report data just as you did when filtering the summary tables.
WebTrends offers numerous dimensions for custom reports. Here are a few examples: Most Recent Campaign Product Manufacturer Search Phrase Lifetime Value Range Day of Week
WebTrends also offers numerous measures for custom reports. Consider the following examples: Active Campaign Revenue Daily Buyers Daily Visitors Order Value Visitor Purchase Count
Note: Not every measure-dimension combination makes sense. Some dimensions are very large and should be used wisely. For example, you don't want to use unique visitor with referrer, because the virtually unlimited number of unique visitors and referrers would overwhelm your tables.
Custom reports support data lookup that translates coded information from your database into more meaningful descriptions. See Campaign IDs and translation tables on page 126. Here are descriptions of several custom reports that may be helpful when you consider the data you might want to analyze:
Buyers versus non-buyers by time period
This report lets you see how many of your web site visitors purchase products from your web site. Compare the number of visitors who make purchases (buyers) to those who do not (non-buyers) by time period.
Content group duration
This report provides insight into which areas of the site are most attractive to your visitors. Analyze the content areas for possible cross-promotions, or analyze over time to interpret content popularity.
Demand Channels
This report shows activity occurring during the report time period segmented according to the demand channel of the last campaign to which a visitor responded.
Geography drilldown
This report provides a drilldown presentation of the geographical information (region, country, state/province, city) relating to the visitor's IP address. The WebTrends GeoTrends Database is required to get complete information down to the state and city level.
Marketing programs
This report shows the marketing programs for the most recent campaigns that drove traffic to your site during the report time period. For the report time period, all conversions and other activities are tracked and attributed to the last campaign to which visitors responded. Thus, even if the conversion does not happen on the first visit generated by the most recent campaign, the appropriate source is credited with the conversion.
Purchase conversion funnel by search phrase (all)
This report helps you understand how the usage of all search engines and phrases correlates to conversion activity on your site. This report includes both organic (for example, natural search) and paid (for example, pay-per-click) search referrals. The conversion funnel allows you to analyze each step of the purchasing process to determine specifically where users are dropping off and what percentage completes the checkout process.
Sales cycle by product
This report page shows the number of days between a new buyer's initial visit and first purchase for each product.
Figure 6-7 shows the number of days between a new buyer's initial visit and first purchase.
new profile that only considers that section of your site. But if you look at the depth of analysis you need for this section of your site, creating hundreds of reports all specific to that section of the site may be overkill. It might be best to instead create a few custom reports that show you the traffic volumes and campaigns that are driving traffic to that section of the site. Likewise, if you need to apply different filters to the same segment of data (for example, one campaign versus a second campaign), you could create a separate profile for each campaign. Again, though, it may be excessive to create the hundreds of reports created by a full analysis profile. Better instead to consider custom reports for each campaign.
Summary
Filtering allows you to narrow down the volumes of web activity data to just the data you want to examine. Different types of filters can be used to focus on just the types of data you wish to analyze. You can apply filters to each line in the web activity log using hit filters, or you can apply filters to visits using visit filters. Visit filters are applied after the individual hits have been filtered and the visit data has been sessionized. You may also specify which data to include or exclude from your reports. Indirectly, this is a way of reducing, or filtering, the data that you see in reports. The benefits of filtering data include not only reducing the amount of data that you need to store in your tables of aggregated data, but also making the amount of data you do want to examine more manageable.
Filtering
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced > Hit Filters or Visitor Filters
Custom Reports
Click on Web Analysis > Profiles & Reports > Report Configuration > Custom Reports
Filtering Worksheet
Use the following worksheet to help understand what kind of filters you need.
Consideration (answer Yes or No, and add Comments)
Do you plan to use image files (such as .jpg, .gif, or .tif files) in your analysis?
Do you plan to include spiders and robots in your analysis?
Do you plan to include hits from people within your own company who look at your web site?
Do you need high-level reports on ad campaigns or more reports on browsers and technical information?
Is visitor segmentation important to your analysis?
Is site segmentation important to your analysis?
Does your company have many divisions requiring parent-child profiles? (Note: Parent-child profiles are only available in WebTrends Professional and WebTrends Enterprise.)
Chapter 7
Acquisition Metrics
Introduction
Nearly every web site shares three fundamental web analytics objectives: acquire more qualified visitors for the lowest cost, convert these visitors into customers, and retain these customers for repeat business.
Acquire more qualified visitors
From online marketing to offline marketing, the first step in winning new customers today is driving new traffic to your web site. But not all traffic is equal. You need to drive the most qualified visitors for the lowest cost. With WebTrends you can get a complete picture of campaign response, campaign conversion, and overall return on investment (ROI). As a result, you can pinpoint exactly which campaigns are working and which aren't.
This chapter discusses acquisition in more detail.
Convert more visitors by analyzing click-by-click behavior
Whether your web site goal is for visitors to register, make a purchase, or get technical support, conversion rate is a critical measure of your site's success. WebTrends provides the most comprehensive navigation analysis in the industry, allowing you to track visitors click-by-click, identify confusing navigation, and minimize abandonment. Isolating problem areas in your site and experimenting with improvements can have a big payoff.
See Chapter 8, Conversion Metrics on page 139 for more information about conversion.
Retain more visitors by segmenting those most likely to return
Once you've persuaded visitors to become customers, you need to retain them as loyal, returning customers. It typically costs 5-10 times more to acquire a new customer than to keep an existing one. WebTrends allows you to evaluate the effectiveness of your loyalty campaigns, such as customer newsletters, by how recently and how frequently visitors are coming back and engaging in repeat business. Now you can measure whether or not you are increasing the average lifetime value of your visitors.
See Chapter 9, Retention Metrics on page 159 for more information about retention.
Entry/Landing page
The first page that a visitor sees on your web site is called the entry or landing page. This is the most important page in your web site, because it provides the initial impression for your visitors and influences whether they will continue to look at other pages of your site. Entry pages can tell you whether people more often start at your home page or jump to the middle of your site, usually via a bookmark or link. Consider also that at the entry point to your site, visitors have not yet begun to navigate around the pages of your site. This may be an opportune time to guide them in the direction you want them to go. From these pages, you can promote areas of your site that you want them to see by putting noticeable links to those areas. In addition, entry pages usually provide good advertising real estate if you sell ad space on your site or promote your own products or services.
In this sample report, Pages refers to any document, dynamic page, or form. Different types of profiles have different default settings for which file extensions qualify a file as a page. Visits refers to the number of visits where the specified page was the entry page. A visit is a series of actions that begins when a visitor views the first page from the server, and ends when the visitor leaves the site or remains idle beyond the idle-time limit. Also, in this sample Entry Pages report, the home page or Welcome Information page is the top entry page. However, many visitors entered first through the products and store pages. Perhaps many of these visitors entered because of an ad campaign. If so, this ad campaign may deserve more scrutiny, because the company may have spent quite a bit of money on attracting customers via that campaign. The information in the Entry Pages report can indicate how you might want to optimize the architecture of your web site based on where your visitors are entering. It can also help you determine which external links are most effective. You may want to consider updating META tags and links.
The following subsections discuss the mechanisms that help visitors find your site. These mechanisms are referrers, ad campaigns, search engines, and email marketing efforts (such as newsletters).
Referrers
Just as a doctor receives new patients from a referring source (such as another doctor or a current patient), your site receives visitors from referrers. A referrer, or referring URL, is the page on another web site that linked visitors to your site. Referring URLs tell you where your visitors came from to get to your site. You can use this information to determine which external sites are the best ones on which to place links to, or ads for, your site. This information can also convince you to develop or maintain positive relationships with these sites so that they will continue to offer a link to your site.
How do you determine what the referrer is? The record in the data file contains the page that was visited before the page represented by a particular entry. So you can ascertain the referrer for each page from the record in the data file. But more interesting is what initiated a visit to the site. How do you determine the referrer for the visit? This is done by taking the first hit in the visit, looking at that hit's referrer, and calling that the visit's referrer. Therefore, all of the referring URLs in the report come from the first hit of each visit. Figure 7-2 shows the domain names of sites that refer visitors to your site.
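The rule that a visit's referrer is the referrer of its first hit can be sketched in a few lines. The hit layout, the sample URLs, and the use of "-" for a missing referrer are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical visit: hits in time order, each with the page requested and
# the referrer recorded for that hit ("-" when none was logged).
visit = [
    {"url": "/promo.htm", "referrer": "http://www.yahoo.com/search?p=phones"},
    {"url": "/cart.htm", "referrer": "http://www.example.com/promo.htm"},
]

def visit_referrer(visit):
    """The visit's referrer is the referrer of its first hit."""
    return visit[0]["referrer"]

def referring_domain(referrer):
    """Reduce a referring URL to its domain; no referrer counts as direct traffic."""
    if referrer in ("", "-"):
        return "Direct Traffic"
    return urlsplit(referrer).netloc

domain = referring_domain(visit_referrer(visit))
```

Later hits in the same visit usually carry referrers from your own site, which is why only the first hit is consulted.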
From this sample report, you can get basic information. However, if you have several different ad campaigns on Yahoo, this report doesn't reveal which one is working best. Consequently, the referrer reports provide general, low-level feedback on your efforts. For more specific information, you will need reports on ad campaigns, search engines, and email marketing.
Ad campaigns
Advertisements can come in many forms, including ads on other sites, popup ads that are triggered, and links embedded in email campaigns. Here are some broad definitions of ads that are frequently used:
Web-based ads
These ads include banner ads that appear on the web pages of sites that your best prospects are likely to visit. Web-based ads have many forms such as text, moving graphics, a call to action (Click here to download), Flash or streaming banners, pop-ups, and pop-unders.
Newsletter-based ads
These ads are directed at publications that your prospects are most likely to be reading. With newsletter ads, you can often choose from among sponsoring the newsletter, sponsoring a column or feature in the newsletter, or placing an ad that will appear among other ads, usually as a text ad.
WebTrends allows you to have a text file that (at analysis time) can translate all of your campaign IDs into their corresponding campaign names.
Redirect pages
Many ads are designed to initially route the user through a redirect page before they can view the ad content. This redirect page quickly and imperceptibly bounces the visitor to the actual page with the ad content, recording the redirect page as the entry page for the session because it was requested first. (Here, the first hit recognized as an ad campaign in the visitor session is counted.) If each redirect page for each placement is distinct from the others, you can track which version of the ad most often took visitors to the ad's content.
Let's say you have two online ads for your product, one on Yahoo and one on AOL. In addition, you sent an email to potential customers with a link that takes them to the content. If you wish to track them all separately, you would create a separate redirect page for each one. In this scenario, you might have the following pages:
Yahoo Ad: /redirect/yahoo_ad.htm
AOL Ad: /redirect/aol_ad.htm
Email Ad: /redirect/email_ad.htm
By tracking visits to each of these redirect pages in the top entry pages, you can see which ad placements most effectively bring people to your site. Figure 7-3 illustrates the redirect process for WebTrends using the web server data collection method. Remember that client-side tagging will not give you this information unless the redirect page has the proper script (see Drawbacks of client-side tagging on page 52).
Using this illustration, if you look in the web data activity file, you will see a two-step process:
This took the visitor from Yahoo.com to the Yahoo redirect page (YahooAd.htm). Status code 302 means that you were redirected. The second web data activity file entry:
GET PromoAd.htm - 200 - YahooAd.htm
This took the visitor from the Yahoo redirect page (YahooAd.htm) to the promotion ad (PromoAd.htm). Status code 200 means that you were successful. Figure 7-4 shows a sample report of top referring pages.
In this sample report, Page refers to any document, dynamic page, or form. Keep in mind that different types of profiles have different default settings for which file extensions qualify a file as a page. Any URL containing a question mark is considered a dynamic page. If Direct Traffic is 100% of all your traffic, then your web server is probably not logging the referrer field in your data files.
You can use WebTrends to create a campaign profile and track either entry or referring pages. However, some ads have several possible referring pages with long, complicated URLs. As a result, it can be more difficult to look up and define a referrer when you set up a campaign profile.
Search engines
Search engines play a large role in acquiring visitors. Whenever someone uses a search engine, there is the chance that they will use a keyword that triggers links to your web site. Search engines typically come in two flavors:
Paid Search Engine
You pay a fee for every person who clicks through to your site, and you have to monitor which keyword phrases are bringing you the best visitors. With Paid Search Engines, you need to evaluate the effectiveness of money spent.
Organic Search Engine
You pay nothing for visitors who come to your site. You monitor which keyword phrases are bringing you the best visitors. With Organic Search Engines, you evaluate the effectiveness of time spent.
People use search engines when they don't know the name of your site or have no other direct link to click or distinct URL to type in their address box. Web site designers go to great pains to figure out how to get recognized by these search engines and appear in the top 10 list that appears when a search is performed. Research consistently shows that more than 80% of web visitors use search engines to find what they need. The longer users are online, the more likely they will use search engines and make purchases. Since most web users believe that those sites that show up at the top of the listings are the most important sites, you must take every reasonable measure to make sure your site ranks highly with search engines for the search keywords and phrases that your most valuable prospects use. If you can't get good rankings by optimizing, you can always try pay-per-click advertising options, which most of the search engines offer.
But search engine technology constantly changes. What you did to get a search engine to effectively recognize your site or page today can have marginal results only a few months later. And each search engine has its own proprietary method of creating a result list based on the search phrases or keywords a web user enters. These lists use the search keywords and phrases to create a list of what they interpret as being the most relevant sites. Search engines also use a host of other factors, including how often visitors click on the link to your site from within their list, and how many of the more popular sites containing related content have hyperlinks to your site. Most search engines also let you register with them, and by paying them to place your site in their index, you can get more exposure than if you'd left it up to chance to get noticed. The element that you have direct control over in this mix is making sure that the keywords you planned for visitors to use to get to your site actually make your site appear in the search results.
By reviewing the top keywords or search phrases entered by visitors, you can find out if those keywords are driving people to your site. If not, you can modify your web page content to promote your site with search engines based on those keywords. Some common ways to modify that content involve including the keyword or phrase in the description and keyword meta tags, and increasing the frequency with which you use the word or phrase in the HTML title, a headline, and the first few paragraphs of the page. These methods will improve your chances of being found and promoted by a search engine.
Note: Search engine optimization is not the focus of this guide. Please consider other resources for a complete discussion of this constantly changing topic.
You can use WebTrends to find the search engines that are used most often by visitors to arrive at your site. You might want to register with search engines if you find that your site is not being noticed. With WebTrends you can generate reports on organic search engines (non-paid search engines) and paid search engines. Figure 7-5 shows a sample report about most recent search engines.
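Search phrases are typically recovered from the query string of the referring URL. The sketch below shows the idea under stated assumptions: the engine host names and the parameter names `q` and `p` are hypothetical placeholders, and real engines differ and change over time.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed mapping of search engine host to the query parameter that carries
# the phrase; these hosts and parameter names are illustrative placeholders.
PHRASE_PARAMS = {"search.example.com": "q", "find.example.net": "p"}

def search_phrase(referrer):
    """Return (engine, phrase) if the referrer is a known search engine
    results page, otherwise None."""
    parts = urlsplit(referrer)
    param = PHRASE_PARAMS.get(parts.netloc)
    if param is None:
        return None
    values = parse_qs(parts.query).get(param)
    if not values:
        return None
    # Lowercase so that "Wireless Phones" and "wireless phones" count together.
    return parts.netloc, values[0].lower()

result = search_phrase("http://search.example.com/results?q=Wireless+Phones")
```

Counting the returned phrases per engine gives the raw material for the kind of search engine and search phrase reports discussed here.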
WebTrends allows you to compare this information with information from a report on the most popular phrases for your site. Figure 7-6 shows a Most Recent Search Phrases report.
Using the information from Figures 7-5 and 7-6, you can compare search engine rankings with the popularity and competitiveness of phrases to get a complete picture of how the web site is performing. Search engine rankings allow you to understand where your site shows up in the list of search results for certain phrases; for example, if you have a phrase that performs particularly well in terms of conversion, but your search engine ranking is low, you may want to try for more highly qualified traffic by boosting your ranking. WebTrends can also analyze paid and organic search engine usage and generate reports that show the total effectiveness of your search engine marketing and optimization strategies based on activity, depth and duration of visit. You can receive separate reports on paid search engine, or organic search engine, or both.
Email marketing
When you want to reach prospects' inboxes, but you need to say more than you would in a newsletter ad, you might consider using direct email and your own customer database, as well as renting a marketing list. You can also email your in-house list of registered visitors who have opted in to receive communications. With email marketing, the recipient can click on a link to your web site, and this visit is automatically recorded and catalogued by WebTrends.
You can use WebTrends to track email campaign results via entry/landing pages as a primary or complementary metric to the other measures produced by email solutions. WebTrends can help you to determine how far recipients get into the conversion process, as well as what they do once they've completed the process and on subsequent visits. Advanced email solutions will track clickthroughs to the site, campaign conversions, and revenue (and in some cases visitors' clickstreams/paths), but this is where the overlap with web analysis solutions ends. Unless the visitor's activities are tied directly to the campaign, meaning the visitor entered your site through the link contained in your email, viewed campaign details/pages, and converted on the campaign offer, most email solutions will not measure them.
You can make your entry pages useful by creating a specific landing page for each email marketing campaign and making sure that nothing except the specified campaign links to that landing page. That is, only the intended email marketing campaign should link to the page; nothing else on your web site should link to it. Then the landing page redirects the visitor to the page that you want them to view.
To analyze the detailed interactions your email visitors have with your site beyond summary campaign information such as the number of responses and conversions, you will need a WebTrends solution. If visitors left campaign-centric pages, where did they go? What content groups or products (beyond the one featured) most interested them? Did email recipients purchase products that weren't featured in the campaign? All of these questions can be answered by using WebTrends. Figure 7-7 shows a report that provides information about all types of campaigns, including e-marketing.
This report lets you compare different kinds of campaign types to see which are the most effective. Of course, the effectiveness is related to how much money you are spending on each campaign.
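One simple way to relate effectiveness to spend is cost per conversion. The sketch below uses made-up campaign figures purely for illustration; the metric choice and the numbers are assumptions, not values from a WebTrends report.

```python
# Hypothetical campaign figures; none of these numbers come from a real report.
campaigns = {
    "email": {"cost": 500.0, "visits": 1000, "conversions": 50},
    "banner": {"cost": 2000.0, "visits": 4000, "conversions": 40},
}

def cost_per_conversion(stats):
    """Spend divided by conversions; None when nothing converted."""
    if stats["conversions"] == 0:
        return None
    return stats["cost"] / stats["conversions"]

costs = {name: cost_per_conversion(s) for name, s in campaigns.items()}
```

In this made-up data the banner campaign brings more visits, but each conversion costs five times as much as with the email campaign.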
parameter field containing a parameter ID. The ID can be used to identify all of the attributes that make up the campaign, such as site name (for example, MSN, Yahoo), program (3rd quarter ProductName upgrade), offer (25% off), creative type (120x120 GIF banner), creative (race car image), and so forth. You can then use a translation file (via WebTrends script or custom table lookups) to create reports on which attributes are most effective (for example, did the race car image do better than the Flash movie of a tornado, or was the 25% off offer more effective than the free year of support).
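The translation step can be sketched as a lookup from campaign ID to attributes, followed by a tally per attribute. The CSV layout, the IDs, and the sample conversions below are hypothetical and do not reflect the actual WebTrends translation file format.

```python
import csv
import io

# Hypothetical translation file: campaign ID, site, offer, creative.
# The column layout is an assumption, not the WebTrends file format.
TRANSLATION_FILE = """id,site,offer,creative
101,Yahoo,25% off,race car image
102,MSN,25% off,tornado Flash movie
103,Yahoo,free year of support,race car image
"""

lookup = {row["id"]: row for row in csv.DictReader(io.StringIO(TRANSLATION_FILE))}

# Campaign IDs seen in the query parameter of converting visits (sample data).
converted_ids = ["101", "103", "101", "102"]

# Tally conversions by the creative attribute behind each campaign ID.
conversions_by_creative = {}
for cid in converted_ids:
    creative = lookup[cid]["creative"]
    conversions_by_creative[creative] = conversions_by_creative.get(creative, 0) + 1
```

Swapping `"creative"` for `"offer"` or `"site"` answers the same question for a different attribute without changing the logged IDs.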
Summary
Acquisition is the most expensive step in getting visitors to your web site. Monetary expenditures on advertising, search engines, newsletters, and similar campaign efforts often make up a large share of a company's budget. But without visitors, especially qualified visitors, your web site is meaningless. Once you have customers, you can work on converting and retaining them. Fortunately, conversion and retention are far less expensive.
Entry Pages and Referrers
Click on Web Analysis > Report Configuration > Campaigns > New Campaign
Ad Campaigns
Click on Web Analysis > Report Configuration > Campaigns
To create a report about ad campaigns, edit a sample profile and click Visitor History. Make sure that Campaign History is checked.
Search Engines
Click on Web Analysis > Report Configuration > Custom Reports > Reports or Dimensions
To create a report about search engines, edit a sample profile and click Visitor History. Make sure that Search Engine History is checked.
Consideration (answer Yes or No, and add Comments)
Are you relying on statistics from organic search engines?
Are you using paid search engines?
Will you use an email newsletter campaign?
Chapter 8
Conversion Metrics
Introduction
After you have attracted visitors to your web site, you can measure how often the visitors take an action in line with what you intended. In other words, conversion means getting visitors to do what you want. For commercial web sites, conversion usually means how often visitors convert into paying customers. However, many commercial sites are interested in lead generation, in which a sales lead may generate a potential conversion to a paying customer later. In either case, the metrics involved with conversion measure the process by which you persuade visitors to take the actions that you intended for them to take. Your conversion rate is a measure of your ability to persuade your visitors to take those actions. The following scenarios are examples of conversion:
Visitors purchasing products
Prospects registering for more information
Customers using your self-service section
Investors downloading your annual report
Employees using your internal site to schedule vacations
Visitors registering for the site's newsletter or to enter contests
The conversion process may involve several steps through your site as visitors navigate their way. Conversion analysis helps you evaluate which types of content successfully support conversion.
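The guide does not spell out a formula for conversion rate. One common definition, assumed here for illustration, is the share of visits in which the visitor took the intended action.

```python
def conversion_rate(converted_visits, total_visits):
    """Share of visits in which the visitor took the intended action.

    This definition (converted visits / total visits) is an assumption;
    a site may prefer to count visitors instead of visits.
    """
    if total_visits == 0:
        return 0.0
    return converted_visits / total_visits

# Sample figures only: 30 newsletter registrations out of 1200 visits.
rate = conversion_rate(30, 1200)
```

Whatever definition you choose, apply it consistently so that rates remain comparable across time periods and campaigns.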
Figure 8-1 shows a report comparing the number of visits by new and returning visitors to your site.
Monetary considerations
Conversion is the beginning of the rewards for having spent so much time and money on the acquisition step. Retention (discussed in Chapter 9, Retention Metrics on page 159) involves the process of how you minimize the ongoing cost. It is much cheaper to keep a customer happy than to get a new one.
Path analysis
Examining where visitors go on your site is called path analysis (also known as clickstream analysis). Path analysis lets you discover whether visitors are navigating your site the way you expected them to, and if not, where they are going instead. Path analysis can also help you track movement between pages, or can take advantage of your content group settings to track movement between groups of related content. Different approaches to path analysis provide different types of insight into your visitors' activity. You can take a free-form approach and track the top paths starting with the entry page. This analysis lets you know where visitors began and where they went on your web site. Or you can look at the most popular routes on your site. You can also narrow or focus your approach by examining certain hot spots on your site, examining which paths led visitors to hot spots and which paths followed from the hot spot. WebTrends excels at path analysis, providing comprehensive information about the navigation of visitors on your web pages.
Complete path
A complete path means that you track all the pages that a visitor traverses during a visit session. This is virtually the same as manually examining each hit in your web data activity file or your SDC-generated web data activity file. If you took this approach, you would have so much data to interpret that you would never be able to recognize patterns in that data. Plus, the amount of data your system would have to process would tax your server's performance considerably. So how can you narrow down the data on all of the paths?
Focused path
Typically, you know which pages in your site are of particular interest to you: the significant pages. So rather than tracking all visitor paths through your site, just track the paths to and/or from significant pages such as entry pages, exit pages, the home page, search pages, shopping cart, or registration pages. Doing so narrows the scope of the data you're viewing and provides far more focus than you would get by tracking every page. Because you are considering less data, you have the bandwidth to research deeper, and you can track to the depth that you want.

On anything other than a simple site, you will still encounter so many paths to or from a given page that meaningful patterns in visitor behavior may be difficult to discern. It's also possible that certain paths, though technically different, are the same in terms of content. Consider Figure 8-2, in which visitors started at different pages to arrive at the Zedesco Search:Search Results page.

In addition, it is not always easy to look at the progression of pages along a path and understand exactly what that behavior indicates. Perhaps instead of seeing visits to the Wireless phones View page in particular, you want to see the level of interest in all product detail pages. This is where you use Content Groups to group related product detail pages.
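The focused-path idea can be illustrated with a short Python sketch. It is not WebTrends code: the function name `paths_to_page`, the session data, the "Headsets View" page, and the `groups` mapping are all hypothetical, and the sketch assumes each visit is already available as an ordered list of page names. It counts the pages viewed just before a significant page, first by page and then rolled up by content group.

```python
from collections import Counter

def paths_to_page(sessions, target, depth=2, groups=None):
    """Count the `depth` pages viewed immediately before each view of `target`.

    sessions: list of visits, each an ordered list of page names.
    groups:   optional dict mapping page name -> content group name
              (a hypothetical stand-in for content group settings).
    """
    counts = Counter()
    for pages in sessions:
        for i, page in enumerate(pages):
            if page != target:
                continue
            lead_in = pages[max(0, i - depth):i]
            if groups is not None:
                # Replace each page with its content group when one is defined.
                lead_in = [groups.get(p, p) for p in lead_in]
            counts[tuple(lead_in)] += 1
    return counts

# Made-up visits that all end at the same significant page.
sessions = [
    ["Home", "Wireless phones View", "Search Results"],
    ["Home", "Headsets View", "Search Results"],
    ["Search", "Search Results"],
]
groups = {
    "Wireless phones View": "Product Details",
    "Headsets View": "Product Details",
}

by_page = paths_to_page(sessions, "Search Results")
by_group = paths_to_page(sessions, "Search Results", groups=groups)
```

With page-level tracking the two product visits look like different paths; with the content group mapping they collapse into a single Home, Product Details path, which is the bigger-picture view that content groups are meant to provide.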
Scenario analysis
A more specialized case of path analysis is scenario analysis. This type of analysis helps you discover whether people are visiting all the pages in a scenario that you intended for them to visit. You typically have an interest in seeing them complete the steps in the scenario because completion of the scenario often translates into revenue. By telling WebTrends which pages make up a scenario, you can track how many people started the process and where along the way they dropped out. If dropout rates are significantly high on specific pages, you may consider factors such as poor site design or insufficient information on those pages.

Scenario analysis also allows you to exclude from analysis any irrelevant pages that the visitor visits while completing the scenario. This would not be possible if you were simply tracking a specified path through the site.

The following is an example of one of the most commonly used web site scenarios: an online purchasing scenario, commonly called a shopping cart. The typical shopping cart scenario might include the following steps:

1. Open the shopping cart.
2. Add products to the shopping cart.
3. Start the checkout process.
4. Complete the order.

The scenario analysis technique tells you what percentage of visitors who complete one step in the sequence also complete the next step. An obvious example is shopping cart completion, but the technique can be applied to a variety of other scenarios, including applications for services, storefinders, feedback forms, personalization processes, and some kinds of on-site searches.

Figure 8-5 shows a Purchase Conversion Funnel report with entry and exit pages. This view shows where people entered the scenario from, and where they went when they exited the scenario at that step or abandoned the scenario. For instance, when a visitor leaves a step, visits another page (page X), and then leaves the site, page X is shown as the exit page from the last scenario step. Note that in this report:

On the left-hand side, you will find the entry pages that lead to one step in the funnel. For more information about entry pages, see "Entry/Landing page" on page 120.

On the right-hand side, you will find the exit pages that show where your visitors went when they left that step in the funnel. For more information about exit pages, see "Exit Page and Exit Ratio Analysis" on page 152.
Figure 8-5. Purchase Conversion Funnel report with scenario entry and exit pages
In this example, the largest number of customers dropped out of the process after opening the shopping cart. Just over 40% of the people who started a shopping cart actually added an item to the cart.

Interpreting these results depends on many variables. Whether or not a visitor starts a process, such as a purchase, is often more dependent on merchandising issues and perceived value than on site design. In contrast, whether or not a visitor finishes a process once they have started it usually depends on variables such as clarity or convenience. These variables are well within the control of the site designer. For this reason, scenario analysis of individual processes is an excellent tool for evaluating the effects of changes in the design of a process. After you configure WebTrends, analysis can be done on a before-and-after basis.

Note that in the table that accompanies the funnel graph, the Scenario Analysis Step column lists the names of the steps in the defined scenario. Each step marks progress on the path that is being monitored. The Step Conversion Rate is the percentage of visits converted from the previous step in the scenario. The Scenario Conversion Rate is the percentage of visits converted from the first step in the scenario.

Sometimes scenarios are non-linear, meaning that visitors may enter a step out of sequence. For instance, with a Quick Checkout process, a visitor may be able to jump from step 1 directly to step 4, and would never be counted in steps 2 or 3. Also, if a visitor leaves the site at step 2 and later returns at that same step, the number of step 2 visitors may be greater than the number of step 1 visitors. WebTrends allows you to view these Step Transitions. This view focuses on how visitors proceeded from one step to the next, or through the scenario. If a visitor proceeded directly from Step 1 to Step 3, Step 3 will appear among the pages listed to the right of Step 1.
Figure 8-6 shows the Step Transitions in the Purchase Conversion Funnel report.
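The two rates in the funnel table can be sketched in a few lines of Python. This is only an illustration of the definitions given above (step rate relative to the previous step, scenario rate relative to the first step); the function name `funnel_rates` and the visit counts are made up and do not come from a WebTrends report.

```python
def funnel_rates(step_visits):
    """step_visits: ordered list of (step name, visits that reached the step).

    Returns rows of (name, visits, step conversion %, scenario conversion %).
    The step rate is None for the first step, or when the previous step
    had no visits.
    """
    first = step_visits[0][1]
    rows = []
    previous = None
    for name, visits in step_visits:
        step_rate = 100.0 * visits / previous if previous else None
        scenario_rate = 100.0 * visits / first if first else None
        rows.append((name, visits, step_rate, scenario_rate))
        previous = visits
    return rows

# Hypothetical counts for the shopping cart scenario.
steps = [
    ("Open the shopping cart", 1000),
    ("Add products to the shopping cart", 400),
    ("Start the checkout process", 200),
    ("Complete the order", 100),
]
rows = funnel_rates(steps)
for name, visits, step_rate, scenario_rate in rows:
    print(name, visits, step_rate, scenario_rate)
```

Because scenarios can be non-linear, a later step can legitimately have more visits than the step before it, in which case a step rate computed this way exceeds 100%.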
Choose the pages for your scenarios carefully, so that the analysis can pinpoint problems. It pays to think through possible problem areas and to try using those pages as steps in the scenario you want to analyze. For example, you might find that visitors are abandoning your site at the page where they are asked to enter their address. Or they might be dropping out at the page that requests their financial information.
Internal Search
Another part of the conversion process takes place after visitors have found their way to a page containing an internal search feature. Visitors can use this search mechanism to find items on your site. Consider stores such as Powell's, Amazon, or Barnes & Noble that have an internal search for books (and other items). By examining the keywords and phrases that visitors searched for, you learn what your visitors' interests are. This information reveals explicit interest, rather than inferred or implied interest. You now know the words that your visitors use to describe your content. This information can help you better organize your site, and it can help you optimize your use of external search engines.
Visit-to-exit ratio
The visit-to-exit ratio compares the number of exits from a given page to the number of visits to that same page. It is important to know what percentage of visitors to a page leave directly from that page, because the pages that receive the most exits are almost always the most visited pages. Rather than creating this ratio for all of your site's pages at once, simply start with the most important areas of your site. After you have calculated the ratios, you can review the pages with the highest percentage of exits per page view to prioritize the exit pages. This kind of information can often reveal a key page with a high visit-to-exit ratio that does not appear among the top exit pages.
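As a rough illustration of the ratio described above, the following Python sketch divides the number of visits that ended on a page by the number of visits that included the page. The function name and the sample visits are hypothetical, and counting each page once per visit is an assumption; a report could equally count page views.

```python
from collections import Counter

def visit_to_exit_ratios(sessions):
    """Return {page: exits / visits} for every page seen in `sessions`.

    sessions: list of visits, each an ordered list of page names.
    A page's exit count is the number of visits that ended on it.
    """
    visits = Counter()
    exits = Counter()
    for pages in sessions:
        if not pages:
            continue
        for page in set(pages):  # count a page once per visit
            visits[page] += 1
        exits[pages[-1]] += 1
    return {page: exits[page] / visits[page] for page in visits}

# Made-up visits: Home has the most exits simply because it has the most traffic.
sessions = [
    ["Home"],
    ["Home"],
    ["Home"],
    ["Home", "Products"],
    ["Home", "Shipping Info"],
]
ratios = visit_to_exit_ratios(sessions)
worst_first = sorted(ratios, key=ratios.get, reverse=True)
```

In this made-up data, Home is the top exit page by raw count (three exits), but Products and Shipping Info have the highest ratios because every visit that reached them ended there. That is the kind of page the raw exit list can hide.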
Dead-End Paths
A dead-end path is a path in which the visitor goes from one page, to another, then returns to that original page. Dead-end paths can be both good and bad. In some cases, it can mean that visitors were looking for specific information, assumed that a given link would take them to that information, but upon arrival at the new page, realized that they had not found what they were looking for. This activity means that they are having trouble finding information. A dead-end visit can just as easily mean that the visitor followed a path out to its natural conclusion, and then came back to the previous page to continue looking for other information. A simple example of a good dead-end path can be seen with an online news site. The person opens the main page, clicks on the International News section, and then clicks on a specific article. After reading the article, they return to the International News section to select another story. This is exactly how you would expect these pages to be used.
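A dead-end path as defined above (page A, then page B, then back to page A) is easy to spot in ordered page lists. The sketch below is a minimal illustration under that definition; the function name and the sample visit are hypothetical, and it says nothing about whether a given dead end is good or bad, which still requires judgment about the pages involved.

```python
from collections import Counter

def dead_end_paths(sessions):
    """Count A -> B -> A sequences, keyed by (A, B).

    sessions: list of visits, each an ordered list of page names.
    """
    counts = Counter()
    for pages in sessions:
        # Slide a three-page window over the visit.
        for a, b, c in zip(pages, pages[1:], pages[2:]):
            if a == c and a != b:
                counts[(a, b)] += 1
    return counts

# The news site example from the text, with made-up article names.
sessions = [
    ["Main", "International News", "Article 1",
     "International News", "Article 2"],
]
dead_ends = dead_end_paths(sessions)
```

Here the only dead end is the expected one: the visitor read Article 1 and returned to the International News section to pick another story.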
cookie ID, the authuser field, or the IP address. Now let's explore where the visitor information goes.

Most online registration forms use the GET method of requesting content. With this method, information entered in the form can be attached as query parameters in the data activity file. There are two ways that these query parameters can then be used to capture visitor information, and they depend on the type of system you have set up to process your web activity data files: a web analysis program or a web data warehouse.

Note: The GET method has a limit of about 2,000 characters. The POST method can also be used, but the content can't be seen in the data activity files. Therefore, the GET method is preferred.

In one method, WebTrends parses the hit (in the web activity data file) for the visitor information parameters you specified that it should locate. WebTrends then takes that information and enters it into a database. With each new hit, the software checks the visitor identifier against visitors already in the database. If the visitor identifier is new, it adds a new row and adds the visitor information to that row. If the visitor already exists in the database, the program attaches the hit information to that visitor record.

The other method involves the use of a web data warehouse, a database that is designed to hold visitor information. You tell the warehouse which parameters hold specific web visitor information, and the warehouse parses the web data activity file, captures the visitor information, and stores it in a visitor database table within the warehouse. All behavioral information associated with that hit is also tied to the visitor via the visitor ID. Subsequent hits go through the same process. If the ID in the hit matches a visitor that has already been identified, only the behavior information for that visitor is updated. If the visitor has not yet been identified, a row is added to the visitor table, and all the behavioral information from that hit is associated with that visitor.

Note: For more information about warehouses, refer to Chapter 10, "Data Integration and Exploration," on page 171.

Keep in mind that any issues you would encounter using cookie IDs or IP addresses to identify the visitor in visit sessionization will also occur when using those same items to identify visitors.
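The parse-and-match logic described above can be sketched with the Python standard library. This is a simplified illustration, not how WebTrends or a warehouse is implemented: the parameter names `email` and `zip`, the `record_hit` function, the in-memory `visitors` dictionary, and the sample URLs are all assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical query parameters that carry visitor information.
VISITOR_PARAMS = ("email", "zip")

def record_hit(visitors, visitor_id, request_url):
    """Attach one hit to a visitor row, creating the row if the ID is new.

    visitors: dict acting as a stand-in for the visitor table.
    """
    parts = urlsplit(request_url)
    row = visitors.get(visitor_id)
    if row is None:
        # New visitor: capture the visitor information from the query string.
        query = parse_qs(parts.query)
        info = {name: query[name][0] for name in VISITOR_PARAMS if name in query}
        row = {"info": info, "hits": []}
        visitors[visitor_id] = row
    # Known or new, the behavior (the page requested) is tied to the visitor.
    row["hits"].append(parts.path)
    return row

visitors = {}
record_hit(visitors, "cookie-123", "/register?email=pat%40example.com&zip=97215")
record_hit(visitors, "cookie-123", "/products?category=hiking")
```

After these two hits the table holds a single row for `cookie-123`, with the form values captured from the first hit and both page requests attached as behavior. A POST-based form would leave nothing in the query string for this kind of parsing to find.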
With SmartView you can get a sense of where your visitors are going and relate the traffic to the actual visual appearance of the page. Consequently, you can see relationships quickly, even ones you did not anticipate. This may lead you to rethink the page's design or direct you toward new territory for further analysis. You might also want to use SmartView to double-check a hunch or an assumption. Since SmartView presents a higher-level and immediate view of the data, you probably will not use SmartView to publish reports on a weekly basis.
Summary
Once you've told WebTrends how to identify visitors so that you can associate visitors with their behavior on your site, you can track the paths that those visitors take through your site. In fact, you can track the distinct pages they traverse through your site, and you can use your content group settings to track how they navigate through your site in terms of the types of content they viewed. Tracking pages can be useful in some cases, but typically you are more interested in getting a bigger picture of how visitors use your site. For this reason, you may prefer tracking paths through content groups rather than through pages.
Path Analysis
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced > Path Analysis
or Web Analysis > Report Configuration > Path Analysis

Scenario Analysis
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced > Scenario Analysis
or Web Analysis > Report Configuration > Scenario Analysis

Shopping Carts
Web Analysis > Report Configuration > Scenario Analysis
To create a report using shopping carts, edit a sample profile and click Visitor History. Make sure that Purchase History is checked.

Search Engines
Click on Web Analysis > Report Configuration > Custom Reports > Dimensions
To create a report about search engines, edit a sample profile and click Visitor History. Make sure that Search Engine History is checked.
Conversion Worksheet
Use the following worksheet to understand how well visitors are converted on your site.
Consideration / Comments

Identify the top 5 key pages in your site that you want to see traffic moving to. What are the paths moving to and from those pages?

Identify the scenarios (especially any registration or checkout pages) in your site.

If you have an internal search feature, do the most popular keywords and phrases really fit your product? Are there other words that visitors should use? Should keywords be listed on a search page or other pages to help visitors make the associations you want them to make?

Identify your dead-end pages. What is the meaning of each dead-end page?

What kind of program can you set up to periodically measure the conversion rate to see if improvement has occurred?
Chapter 9
Retention Metrics
Introduction
The vast majority of web sites need to retain their visitors. You've gone through a lot of hard work and expense to attract visitors and convert them into buyers or registered users. Now it's time to keep those visitors. From a monetary perspective, retention is the process of minimizing your ongoing cost. It is much cheaper to keep a customer happy than to get a new one. Customers who return again and again have the highest value, which translates into profits for commercial businesses.

To make retention work for you, you must find out more about your visitors and their behavior. Understanding your visitors and their behavior will help to answer the following questions:

On which visitors should you spend marketing dollars? When?
What can you expect in future sales from your existing visitors?
How do you predict which ads and products generate the best visitors?
What kind of incentives should you provide to get a visitor to do something you want them to?
Can you predict which visitors will be responsive to your program?
Should some visitors be contacted more often than others?
How can you put a value on your visitors and business as a whole, and project this value into the future?

Visitor retention activities are an investment, made with the expectation that the value of the investment will rise. But first you've got to know more about your visitors and their behavior.
Once you've identified the behavior of specific population segments on your web site, what now? This level of insight into your web visitors allows you to take action, if needed, to better capture the audience you want to attract. This is the information that lets you implement a continuous improvement cycle: you measure the activity for a given offer or ad campaign, make a decision based on that measure, take some action based on the decision, then re-measure to see what effect the action had.

Let's consider a scenario in which a wireless phone company uses a cellular phone package to target 18 to 25-year-olds. The company might run an advertisement that web visitors access via promotions on ten different sites. These ten web sites were chosen because they are geared toward a younger crowd. When visitors link to the ad, before learning more about the package, they are prompted to fill out a survey that requests information on their age, sex, zip code (if applicable), and current occupation. After one week, the cellular phone company reviews which referring sites tended to send the greatest number of 18 to 25-year-olds, the target audience. At that point, the company continues paying for the promotion on the sites that referred the most targeted visitors, but discontinues the ad on the sites that failed to do so. By tying web behavior to its web visitors, the cell phone company was able to quickly identify where its marketing dollars were being spent effectively and where they were being wasted.

Even if you only learn about the behavior of visitors, you can move ahead. For example, you can compare the repeat rate of visitors generated by different banner ads or keyword phrases.
Recency
Number of days since the most recent visit of a visitor. Note that zero recency means that the visitor visited within the last 24 hours. Most businesses find recent customers to be more valuable than customers whose activity has been dormant for a long time.

Frequency
Number of visits since the visitor was first tracked. There's a great deal of difference in value between a 100-time repeat visitor and a 2-time visitor.

Latency
Number of days between visits for visitors. Note that zero latency means that the visitor visited every day. Latency can be especially helpful for businesses where orders and contacts have a defined cycle (for example, subscription-based businesses and businesses selling durable goods or high-ticket items).
All three measurements can be used to determine the potential value of your visitors.
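The three measurements can be sketched for a single visitor as follows. This is one possible reading of the definitions, not the WebTrends calculation: recency is taken as whole days since the latest visit (so a visit within the last 24 hours gives zero), and latency is taken as the average number of full days without a visit between consecutive visit dates (so a daily visitor gets zero). The function name and the dates are made up.

```python
from datetime import datetime

def recency_frequency_latency(visits, now):
    """visits: non-empty list of datetime objects, one per visit."""
    # Recency: whole days since the most recent visit.
    recency = (now - max(visits)).days
    # Frequency: number of visits tracked so far.
    frequency = len(visits)
    # Latency: average count of visit-free days between consecutive visit dates.
    days = sorted({v.date() for v in visits})
    gaps = [(later - earlier).days - 1 for earlier, later in zip(days, days[1:])]
    latency = sum(gaps) / len(gaps) if gaps else None
    return recency, frequency, latency

visits = [
    datetime(2004, 3, 1, 9, 0),
    datetime(2004, 3, 2, 10, 0),
    datetime(2004, 3, 5, 8, 0),
]
now = datetime(2004, 3, 5, 20, 0)
rfl = recency_frequency_latency(visits, now)
```

For this visitor the last visit was 12 hours ago (recency 0), there have been three visits (frequency 3), and the gaps between visit dates contain zero and two idle days (latency 1.0). A visitor with only one visit has no latency yet, which the sketch reports as `None`.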
Lifetime Value
Lifetime value is a concept that applies to commercial web sites, because these sites need a long-term gauge for their repeat customers. Lifetime value represents the total sales generated since tracking a specific visitor began. Figure 9-2 shows the lifetime value of visitors to the Zedesco web site.
Reports that reveal lifetime value have a great influence on the types of offers you might present to your visitors. For example, the report in Figure 9-3 shows the lifetime value of buyers for the most recent campaign they responded to, and displays it in a drilldown. A drilldown enables users to examine this information at a highly summarized level and navigate to successively more detailed levels of campaign data; for example, viewing the lifetime value of buyers by demand channels, partners, marketing programs, marketing activities, campaign IDs, campaign descriptions, and more.
If you run this report again a few months later and find that the average latency for most of your customers is increasing, then you will want to take action to correct this behavior.
Visitor History
WebTrends allows you to collect the behavior of individual visitors over a period of time. This is called visitor history, and it is primarily used to track visitors' purchasing behavior, such as how well visitors have responded to advertisements, how much money they spent, how many times they bought something, and how many items they bought.

Purchase count
Lifetime count of purchases from the shopping cart

Most recent purchase value
The value of the most recent purchase

Days before first purchase
The number of days between a visitor's first visit and first purchase

Days since first purchase
The number of days since a visitor's first purchase

Days since most recent purchase
The number of days since a visitor last purchased an item

In other words, visitor history allows you to measure visitor activity according to recency, frequency, latency, and lifetime value. Visitor history can help you find out which customers you might lose. For example, the information you get from visitor history might cause your marketing department to send special offers to customers who haven't been active for a while. In general, visitor history can help you convert one-time users into frequent users.
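The purchase measures listed above can be derived from a visitor's first visit date and a list of purchases. The sketch below is illustrative only: the function name, the dictionary keys, and the sample purchases are hypothetical, and lifetime value is computed simply as the sum of the purchase values tracked for the visitor.

```python
from datetime import date

def purchase_history(first_visit, purchases, today):
    """purchases: list of (purchase date, order value) tuples.

    Returns None when the visitor has not purchased anything yet.
    """
    if not purchases:
        return None
    ordered = sorted(purchases, key=lambda purchase: purchase[0])
    first_date = ordered[0][0]
    last_date, last_value = ordered[-1]
    return {
        "purchase_count": len(ordered),
        "most_recent_purchase_value": last_value,
        "days_before_first_purchase": (first_date - first_visit).days,
        "days_since_first_purchase": (today - first_date).days,
        "days_since_most_recent_purchase": (today - last_date).days,
        "lifetime_value": sum(value for _, value in ordered),
    }

history = purchase_history(
    first_visit=date(2004, 1, 5),
    purchases=[(date(2004, 1, 15), 120.0), (date(2004, 2, 20), 80.0)],
    today=date(2004, 3, 1),
)
```

A marketing rule such as "send an offer when days since most recent purchase exceeds some threshold" can then be written directly against these values.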
The visitor history records are stored in the visitor history database, which is under the hood of WebTrends. That is, you don't see it or have to worry about it. The only thing you have to do is make sure that you activate the visitor history checkbox in the UI if you need visitor history for some analysis. The procedure is detailed in "Finding the Features in WebTrends Products" on page 168.
WebTrends stores aggregated information about purchases. This aggregation is sophisticated enough to make fine distinctions such as invoice rejection. For example, if a visitor goes to a shopping cart site and accidentally submits twice on a purchase page, WebTrends can detect the unintended action and make sure that it will be counted once instead of twice. WebTrends can also detect an accidental bookmark to a purchase page and count that visit properly.
6) At-risk visitors/customers
To find out about past visitors who have not been to a site in a number of days, you can use the recency metric and then decide if you would like to appeal to them (perhaps based on previous loyalty) with special offers.
After you have defined your unique visitors, you may be interested in certain groups of these visitors, such as those who have a lifetime value of at least $500. Or you could look at unique visitors who have a recency of once a day or once a week, and compare their lifetime values. In any case, by tracking the activity of these groups of unique visitors, you can adjust your marketing efforts and make special offers based on the information you find.

However, if you have a web site with heavy traffic, there is no way you can keep a complete list of every visitor who has touched every page, every content group, and so on, because the record keeping quickly expands into unmanageable lists. The issue is counting uniqueness: you have to have a record for everybody who did something. Counting uniqueness translates into maintaining a complete list of visitors who performed a specific action on one page, then maintaining another list for another page. The numbers for each page get very large very quickly. For example, a web site with a million visitors and ten thousand pages has ten billion combinations to contend with. And that's just for pages!

The enormity of the problem of counting uniqueness affects web sites with fewer pages and visitors, too, because many of these sites want to know how many visitors touched their pages during a particular week or a particular month. That adds a time dimension, and the number of records needed to keep track of this activity skyrockets. Fortunately, with WebTrends, you can track visitor uniqueness over a period of time (daily, weekly, monthly, etc.) and begin to interact with your customers on a more individual basis.
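The arithmetic behind the ten billion figure, and the reason uniqueness needs a list per page, can be shown in a brief Python sketch. The names are hypothetical, and multiplying by 52 weekly periods is only an assumed example of adding a time dimension; it does not describe how WebTrends stores this data.

```python
from collections import defaultdict

def unique_visitors_per_page(hits):
    """hits: iterable of (visitor ID, page) pairs.

    Counting uniqueness means remembering every visitor ID seen on each page.
    """
    seen = defaultdict(set)
    for visitor_id, page in hits:
        seen[page].add(visitor_id)
    return {page: len(ids) for page, ids in seen.items()}

# Worst case from the text: every visitor could touch every page.
visitors = 1_000_000
pages = 10_000
combinations = visitors * pages          # ten billion visitor/page pairs

# Assumed example of a time dimension: one set of lists per week of the year.
weekly_combinations = combinations * 52

counts = unique_visitors_per_page([
    ("v1", "Home"), ("v2", "Home"), ("v1", "Home"), ("v1", "Cart"),
])
```

The repeated `("v1", "Home")` hit does not raise the count for Home, but only because the set still holds `v1`. That stored ID is the record keeping the text refers to.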
Retention Worksheet
Use the following worksheet to understand how well your site retains its visitors.
Consideration / Comments

On which visitors should you spend marketing dollars? When? How often?

When launching ads, do you target specific visitors or send out general information to all visitors?

Which visitors will be responsive to your programs?

Which visitors should be contacted more often than others?

How can you put a value on your visitors and business as a whole, and project this value into the future?
Chapter 10
Demographic data
Perhaps you have the state associated with each web visitor record, and you want to tie that activity into a database that describes demographics by state. Numerous databases exist that can help you segment your visitor population. For example, WebTrends GeoTrends provides demographic information.

Let's consider a straightforward scenario: Zedesco's budget limits them to airing a TV commercial in only one state. If they are using their web site as a basis for deciding in which state to air the commercial, what information might they need? One of the most basic pieces of data they could look at is which states show the most web viewing activity, such as the most page views or the most visits. If two states show similar activity levels, the next step might be to see which state has the most buying power. To do this, they could tie into a demographic database that contains information on average income level by state. If they find that, of the two states showing the most activity, one has a lower average annual income, then assuming all other variables are equal, they'd air the advertisement in the wealthier state.
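The Zedesco decision can be written down as a small rule. Everything in this sketch is an assumption made for illustration: the function name, the 10% tolerance used to decide that two states show "similar" activity, and the visit and income figures, which are invented rather than taken from any demographic database.

```python
def pick_state(visits_by_state, income_by_state, tolerance=0.10):
    """Pick the most active state, using income to break a near-tie.

    tolerance: hypothetical threshold; the top two states count as similar
    when their visit counts differ by no more than this fraction of the leader.
    """
    ranked = sorted(visits_by_state, key=visits_by_state.get, reverse=True)
    top = ranked[0]
    if len(ranked) == 1:
        return top
    runner_up = ranked[1]
    gap = visits_by_state[top] - visits_by_state[runner_up]
    similar = gap <= tolerance * visits_by_state[top]
    if similar and income_by_state[runner_up] > income_by_state[top]:
        return runner_up
    return top

# Invented figures for illustration only.
visits_by_state = {"OR": 5200, "WA": 5000, "ID": 1200}
income_by_state = {"OR": 42000, "WA": 46000, "ID": 38000}
choice = pick_state(visits_by_state, income_by_state)
```

With these invented numbers the two leading states are within the tolerance, so the higher average income decides the outcome. With a wider activity gap, the most active state would win regardless of income.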
Customer databases
Joining web visitor information to web visitor activity is useful for marketing professionals as they try to more accurately target their marketing using the web. But you can also use your web activity and web visitor data for account management. You do this by joining the web activity of individual web visitors with their account contact data in Customer Relationship Management (CRM) systems such as Siebel Call Center or PeopleSoft.

CRM systems are database-driven applications that are generally used to manage the information about an organization's prospects and customers. These systems often contain information about customers or customer prospects, such as:

Correspondence
Contact information
Previous transaction information
Communication via email, phone, or regular mail

Joining web visitor and web activity data to complex databases such as those used by CRM systems requires the structure of a web data warehouse. To join the two sets of data, you need one or more shared keys, or IDs, to match the records in one database with records in the other. Typically, this will be some visitor ID in the web activity database and a customer ID in the call center database. Other possible shared keys between the two databases could be combinations of first and last names or email addresses. Figure 10-1 illustrates the shared keys between two databases.
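A shared-key join can be sketched in plain Python. In a real deployment this would be a database join inside the warehouse; the field names `visitor_id` and `customer_id`, the function name, and the sample records below are hypothetical and serve only to show how a shared key matches records from the two sources.

```python
def join_on_shared_key(web_rows, crm_rows,
                       web_key="visitor_id", crm_key="customer_id"):
    """Match web activity rows to CRM rows whose key values are equal.

    Rows with no match in the CRM data are left out (an inner join).
    """
    crm_index = {row[crm_key]: row for row in crm_rows}
    joined = []
    for web in web_rows:
        crm = crm_index.get(web[web_key])
        if crm is not None:
            joined.append({**crm, **web})
    return joined

# Made-up records; the visitor ID and customer ID share the same values.
web_rows = [
    {"visitor_id": "C-1001", "page": "/support/faq", "views": 3},
    {"visitor_id": "C-2002", "page": "/demo", "views": 1},
]
crm_rows = [
    {"customer_id": "C-1001", "name": "Pat Lee", "email": "pat@example.com"},
]
joined = join_on_shared_key(web_rows, crm_rows)
```

Only the visitor whose ID also exists as a customer ID appears in the result. Joining on a name or email address instead would just mean passing different key names, at the cost of less reliable matches.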
Joining web activity with visitor information lets salespeople understand their visitors' interests with information such as:

Which web pages they visited
How many times they visited those pages
How long they stayed
Which products or topics they researched
How much information and interest they have about specific products, as evidenced by the white papers, demos, or other marketing and technical materials they downloaded from the web site

Service professionals can also use this combination of information to review a customer's web activity and prepare for handling the customer's issue. Useful information includes troubleshooting topics, frequently asked questions, or technical white papers that the customer has already examined. In addition, by reviewing how often specific troubleshooting topics or frequently asked questions are accessed, support organizations can determine if products or documentation have weaknesses or other issues that need to be addressed.

Figure 10-2 shows an environment that is running machines that use web analysis and warehouse data. In this illustration, the client machine is able to view reports on the warehouse using a reporting application such as Crystal Reports. The warehouse can communicate with other sources of data, such as CRM or Enterprise Resource Planning (ERP) systems, and wed that information with the warehouse data.
An Excel Wizard takes you through several easy-to-use steps before generating the report. It's important to be aware that the more dimensions you specify and the longer the time period you export into Excel, the more calculations must be performed and the harder your system has to work. Important: An Excel worksheet is limited to about 65,000 rows of data (65,536, to be exact).
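Product Category | Subcategory Level 1 | Subcategory Level 2
Camping | | 3-season, 4-season
Camping | | Backpacking, Car Camping
Hiking | Boots | Men's, Women's
Hiking | Clothing | Men's, Women's
Hiking | Backpacks | Internal Frame, External Frame
Boating | Kayaks | Inflatable, Non-inflatable
Boating | Canoes | Inflatable, Non-inflatable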
Within WebTrends reports, you can interactively click on a given dimension and drill down to the next level. For example, if instead of examining all product categories (Camping, Hiking, and Boating) you only wanted to view information about the Hiking category, you could simply click on the Hiking product category and view information about Boots, Clothing, and Backpacks. Within Excel, you can drill as far as you have specified in WebTrends drilldowns. For instance, using the example above, within the Hiking product category you could drill down three levels and examine visits to pages in the Internal Frame subcategory of the Backpacks subcategory. Figure 10-4 shows an Excel spreadsheet with categories and subcategories.
Data exploration
With Excel's tools, you can choose the exact dimensions and measures you want to compare, and you can discover significant correlations between dimensions. These tools use automated machine learning and statistics to uncover trends, which Excel can present in a variety of graphs, tables, and charts.
Data exploration is an iterative process. You will need someone who is adept at statistics and is willing to look at the same data again and again in order to find the nuggets in the data. Figure 10-6 shows an Excel chart with trend data mapping campaigns by sum of gross revenue for December 2003. This is an example of charting data that is calculated in Excel and shown in a graphical format.
Figure 10-7 presents another Excel chart of trend data mapping. Note that you can use PivotTable reports to filter the data by group, department, etc., and that this filtering can change the visual representation in the graph.
Figure 10-8 shows the calculation of Gross Margin Return on Investment for various demand channels. External data such as Marketing Cost Per Click and actual product costs were added to the original WebTrends data and then used to calculate the GMROI.
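The text does not give the formula behind Figure 10-8, so the following Python sketch is only one plausible reading. It assumes that gross margin is gross revenue minus actual product cost and that the investment is the marketing spend (clicks multiplied by cost per click). The function name, parameters, and figures are hypothetical, and the classic retail definition of GMROI divides by average inventory cost instead.

```python
def gmroi(gross_revenue, product_cost, clicks, cost_per_click):
    """Gross margin earned per unit of marketing spend (assumed definition).

    Returns None when there was no marketing spend to compare against.
    """
    marketing_cost = clicks * cost_per_click
    if marketing_cost == 0:
        return None
    gross_margin = gross_revenue - product_cost
    return gross_margin / marketing_cost

# Invented figures for one demand channel.
channel_gmroi = gmroi(
    gross_revenue=10_000.0,
    product_cost=6_000.0,
    clicks=2_000,
    cost_per_click=0.50,
)
```

Under these assumptions the channel returns four units of gross margin for every unit spent on clicks. Running the same calculation for each demand channel gives a like-for-like comparison.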
Another data exploration exercise might involve examining relationships between visitor attribute data (income level, zip code, gender) and the content groups and ad campaigns visited. To do this, you would have Excel compare each visitor attribute and combination of visitor attributes against content groups, against the combination of content groups and ad campaigns, and then against ad campaigns.

But practically speaking, what are the benefits of data exploration? Data exploration can be used to reveal significant trends in customer behavior. For example, an online travel site might find that women from zip code 97215 with an annual income of $70K visit the last-minute deals pages and respond to e-mail ad campaigns more than any other visitor population segment. Knowing this, you might choose to send out a targeted e-mail for a last-minute deal, and then use standard web analysis reporting to see if that e-mail campaign is effective.
Your web site does not have to register a million hits to make data exploration cost effective. It's more about the money attached to your traffic than the total amount of traffic. Data exploration can be a cost-effective solution for web sites with a lot of money riding on a small amount of traffic.

Data exploration will give you a lot more insight at a higher (and deeper) level, but the exploration involved can be expensive. You may explore many avenues before you reach the right one or ones (for example, by using A/B testing), so you'll need some intelligence to figure out which way to go. Since data exploration is very open-ended, you need to narrow down the many possibilities to achieve meaningful results. Consequently, a data exploration solution for your company doesn't mean that you merely purchase more software, plug it in, and watch your income grow. You will have to look hard at adding the right kind of personnel, who will work hard to interpret the data.
Consideration | Yes | No | Comments
Do you have external data that you want connected to web behavior? | | |
Can you afford a web data warehouse in terms of costs relating to people, software, hardware, and planning? | | |
Will there be compatibility issues if you bring any pre-existing external data into the warehouse? | | |
Do you have data that you need to investigate in Excel? | | |
Do you have Excel experts who know how to work with PivotTables? | | |
Chapter 11
in size. If your site experiences up to 5,000,000 hits per day (an amount of web traffic that is not unusual for enterprise-level organizations), your web data activity file can easily reach several gigabytes. For large organizations with extremely active web sites, generating terabytes of data per year is common. Because even a daily web data activity file can require gigabytes of storage space, most organizations implement a log file rotation scheme that keeps computing resources available for processing tasks. Depending on the volume of web traffic that your site experiences, you may wish to rotate (roll over) web data activity files daily, weekly, or monthly.
Note: When IIS servers roll over on a daily basis, they close out one log file and start another at 12:00 am GMT, not at midnight local time.
Note: You can review the process of log file rotation/rollover in Log file rotation/rollover on page 45.
Figure 11-1 shows a basic overview of log file rotation, rollover, and archiving.
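The note about IIS rolling over at 12:00 am GMT can be made concrete with a short calculation. The following Python sketch is illustrative only: the function name and the fixed UTC offsets are assumptions, and fixed offsets ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def local_rollover_time(utc_offset_hours: float) -> str:
    """Return the local wall-clock time at which a 00:00 GMT rollover occurs."""
    # Midnight GMT on an arbitrary date; only the time of day matters here.
    midnight_gmt = datetime(2004, 1, 15, 0, 0, tzinfo=timezone.utc)
    # Assumption: the local zone is modeled as a fixed offset from GMT.
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return midnight_gmt.astimezone(local_zone).strftime("%H:%M")


print(local_rollover_time(-8))  # a site eight hours behind GMT
print(local_rollover_time(1))   # a site one hour ahead of GMT
```

For a site eight hours behind GMT, the daily log closes in the late afternoon local time, so a "daily" file does not line up with a local calendar day.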
Rotation schedules can also depend on how you access your web data activity files, and how often you intend to report on them. If you use FTP to access your web data activity files and you generate reports hourly, then you must rotate your web data activity files hourly. Hourly rotation is necessary because, in order to run reports, the web data activity file must first be transferred to the local analysis machine. With a mapped drive, the transfer is not required because, to your system, the drive already appears to be local. Therefore, whenever reports are scheduled to run, WebTrends does not need to transfer an entire file, because the file, for all intents and purposes, is local. Typically, organizations rotate their web data activity files daily. Unless you need to generate reports hourly or more frequently, daily rotation is usually a good rule of thumb. But once you've rotated the files out and analyzed them, you need to determine how long to archive them. The length of archival depends on your reasons for holding onto the data. Some organizations don't intend to ever re-analyze their data, and consequently throw out the data shortly after the analysis. Other organizations hold onto their data forever. For most organizations, a basic rule of thumb is to archive data for between one quarter and one year.
Recommendations
Rotate web data activity files daily, but consider hourly rotation if you access your web data activity files via FTP and your site experiences a considerable amount of traffic. Archive analyzed web data activity files for one year.
the data up to the last known good copy of it, you will need to fill in the data that was not contained in that backup. This requires you to reload and re-analyze the raw web data activity files for the period from the time of the backup to the most current web data activity file. Let's go back to the earlier example in which the content group was incorrectly set up. If your web site experiences a significant amount of traffic, and for that reason each daily web data activity file analysis requires around 10 minutes to run, you might determine that you could afford the time it would take to re-analyze up to 28 days of data at any given time. You also feel that 28 days is enough time to discover any issues, considering that you review reports once a week. Your storage capabilities allow you to have four backups of the data. This means that when a fifth backup is created, it replaces the oldest backup. In this situation, a sensible solution could be to back the data up every seven days and maintain four backups. This allows you to make the most of the storage space you have, yet be assured that you will catch any problems with the data long before your oldest archive is overwritten. This means that given the following situation (shown in Figure 11-2):
Archive 1
Archive 2
New content group with syntax problem added one day after Archive 2 was created
Archive 3
Syntax problem discovered three days after Archive 3 was created
You have two options:
1. Correct the syntax for the new content group, and then go back and re-import and re-analyze all the raw web activity data from day one (assuming you still have those web data activity files).
2. Go back to the last known good set of summary tables and then re-analyze the data from that day up to the current day. In this case, you would restore Archive 2, the last archive that contained data without the syntax problem, correct the syntax for the new content group, and then re-analyze the raw web data activity file data up to the current day.
As you can imagine, creating and maintaining multiple backup copies of an entire database can require substantial storage space on your computer. It's important to consider the trade-off between the storage space you have available and how many backup copies you can afford to keep around at any given time. This trade-off is also affected by how long it would take to restore lost data, which in turn is affected by how much traffic your site experiences, which summary tables you choose to create, and how powerful your system is. How often you may need to back up data also depends on how closely you monitor the results of your data. If you only review results once a day, then creating daily backups, or a backup every couple of days, might be fine because you will probably catch any issues within a few days.
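The second option can be sketched as a small calculation. This Python example is a rough illustration, not part of WebTrends: the function names and the day numbers (archives on days 7, 14, and 21) are hypothetical, chosen to match the weekly backup schedule and the roughly 10 minutes per daily file used in the example.

```python
def reanalysis_minutes(days_to_reanalyze, minutes_per_daily_file):
    """Estimate the time needed to re-analyze a run of daily files."""
    return days_to_reanalyze * minutes_per_daily_file


def newest_clean_archive(archive_days, problem_day):
    """Return the day of the latest archive made before the problem appeared."""
    clean = [day for day in archive_days if day < problem_day]
    return max(clean) if clean else None


# Hypothetical timeline: Archives 1-3 are taken every seven days.
archive_days = [7, 14, 21]
problem_day = 15      # bad content group added one day after Archive 2
discovered_day = 24   # problem found three days after Archive 3

restore_from = newest_clean_archive(archive_days, problem_day)
days = discovered_day - restore_from
print(restore_from, days, reanalysis_minutes(days, 10))

# Worst case under the stated plan: 28 days at about 10 minutes each.
print(reanalysis_minutes(28, 10))
```

Under these assumptions, restoring Archive 2 means re-analyzing 10 daily files, well under the 28-day worst case of 280 minutes.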
Recommendations
Check how much disk storage space you have to save the backups versus the average size of a backup. Determine how long it takes to restore data by analyzing it from the raw web data activity file. This is affected by how much traffic your site generates, which summary tables you choose to create (daily, weekly, monthly, etc.), and how fast your system can process the data. Figure out how soon you are likely to catch issues that may necessitate restoring a backup by how closely and frequently you monitor your analysis results.
uncompressed and placed in a temporary storage location, or cache, that is located on the analysis machine or at least on a drive that is mapped to the local machine so that it appears to be local. The web data activity files are accessed from this cache during analysis, but at the end of analysis, you need to decide what to do with the uncompressed files. If you suspect that you will run many analyses on the uncompressed files, it makes sense to hold them, in uncompressed form, in the cache. This saves the time required to transfer them to the cache and unzip them. For web data activity files of any significant size, these time savings can add up. On the other hand, if you are fairly certain that you will not use the file again, you don't need to use space on your machine to save those files. Depending on how your WebTrends software approaches this cache situation, you may have the choice to:
Delete the file from cache upon completion of the analysis.
Keep the file in cache for a specified number of days.
Keep the file in cache until the cache reaches a maximum size, at which point the oldest files in the cache will be replaced by new, incoming files.
Keep the file in the cache, but delete it if it is not accessed within a specified period of days.
Recommendations
If you do not plan to re-analyze a web data activity file, you can save space on your local machine by choosing to delete it immediately upon completion of analysis. If you suspect that you will re-analyze your web data activity files, configure your software to maintain the uncompressed version of your files in a local cache for a specified period of time or until the cache reaches a maximum size.
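The cache choices described above can be expressed as simple eviction rules. The sketch below is an assumption-laden illustration in Python, not the WebTrends implementation: the `CachedFile` record, the `files_to_evict` function, and its parameter names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size_mb: float
    cached_day: int        # day the file entered the cache
    last_access_day: int   # day an analysis last read the file


def files_to_evict(files, today, max_age_days=None,
                   max_idle_days=None, max_cache_mb=None):
    """Return the names of cached files that the chosen rules would delete."""
    evict = set()
    for f in files:
        # Rule: keep the file for a specified number of days.
        if max_age_days is not None and today - f.cached_day > max_age_days:
            evict.add(f.name)
        # Rule: delete the file if not accessed within a specified period.
        if max_idle_days is not None and today - f.last_access_day > max_idle_days:
            evict.add(f.name)
    if max_cache_mb is not None:
        # Rule: when the cache exceeds its maximum size, drop the oldest first.
        remaining = sorted((f for f in files if f.name not in evict),
                           key=lambda f: f.cached_day)
        total = sum(f.size_mb for f in remaining)
        for f in remaining:
            if total <= max_cache_mb:
                break
            evict.add(f.name)
            total -= f.size_mb
    return evict


cache = [CachedFile("mon.log", 100, 1, 1),
         CachedFile("fri.log", 100, 5, 9),
         CachedFile("tue.log", 100, 9, 9)]
print(sorted(files_to_evict(cache, today=10, max_cache_mb=150)))
```

Deleting a file immediately after analysis is the degenerate case in which nothing is ever placed in the list to begin with.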
designate for storing web data activity files. And you will likely have the same choices you had when deciding how to handle the web data activity file you accessed via FTP. Namely:
Delete the file from cache upon completion of the analysis.
Keep the file in cache for a specified number of days.
Keep the file in cache until the cache reaches a maximum size, at which point the oldest files in the cache will be replaced by new, incoming files.
Keep the file in the cache, but delete the file if it is not accessed within a specified period of days.
Internet resolution
When your web server generates a web data activity file, it can either be configured to look up the domain name for the client machine's IP address as it creates the file, in a process known as reverse DNS, or it can leave the IP address unresolved. The more efficient approach is to look up the IP address during web data activity file creation; however, because this process (known as Internet resolution) takes some of the server's resources, web site content delivery may be negatively affected. For this reason, many web servers are not configured to perform a lookup. The reality is that when reviewing reports about your web visitors, just receiving the IP address of your visitor does not give you much insight. An IP address can't let you easily see that many of your visitors come from the competition, or that many of your visitors come from a company with whom you are trying to establish more business. IP addresses also affect visitor counts, because multiple IP addresses can resolve to the same domain name. WebTrends software gives you the option to look up IP addresses from DNS servers. Once looked up, these IP addresses are stored in a cache so that future analyses can grab that information locally, rather than having to go through DNS servers to locate the information. You need to weigh the value of having IP addresses translated into meaningful names against the disk space that the cache of resolved addresses occupies. Typically, the cache of resolved addresses has a maximum size, and when that limit is reached, the oldest entries are deleted to make room for the most recent. In addition, you need to weigh the impact on performance that looking up IP addresses will have on your analysis system.
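The lookup-and-cache behavior described above can be sketched in a few lines. This Python example is only an illustration of the idea, not how WebTrends implements it: the `ResolvedAddressCache` class and its parameters are hypothetical, and the demonstration uses a stub resolver with made-up addresses so that it runs without network access. The default resolver uses the standard library call `socket.gethostbyaddr`.

```python
import socket
from collections import OrderedDict


class ResolvedAddressCache:
    """Cache of reverse-DNS results with a maximum number of entries."""

    def __init__(self, max_entries=1000, resolver=None):
        self.max_entries = max_entries
        self.resolver = resolver or self._reverse_dns
        self.entries = OrderedDict()   # insertion order = age of entry

    @staticmethod
    def _reverse_dns(ip):
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:
            return ip   # no name registered: leave the address unresolved

    def lookup(self, ip):
        if ip in self.entries:
            return self.entries[ip]        # answered locally, no DNS query
        name = self.resolver(ip)
        self.entries[ip] = name
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)   # delete the oldest entry
        return name


# Stub resolver with hypothetical data, used instead of real DNS.
known_hosts = {"192.0.2.10": "proxy.example.com"}
cache = ResolvedAddressCache(max_entries=2,
                             resolver=lambda ip: known_hosts.get(ip, ip))
print(cache.lookup("192.0.2.10"))
print(cache.lookup("192.0.2.99"))   # unregistered address stays as-is
```

The second lookup of any address is served from the cache, which is where the performance benefit for repeated analyses comes from.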
Recommendations
Determine how important it is to have the looked-up values of IP addresses in your reports. The space required by these looked-up values can be fairly minimal, but the performance slowdown can be noticeable. Most people tend to have the lookup performed if the web server did not already do this. Note that a company may use many IP addresses that are assigned to it but only register a few of these addresses as domains. For example, a company may have many proxy servers with addresses that connect to the Internet, yet because the company doesn't expect anyone to connect to the proxy, it hasn't assigned a domain to the proxy. Consider using WebTrends GeoTrends, which will resolve IP addresses more accurately than DNS. That is, GeoTrends identifies the companies that registered the IP addresses. GeoTrends also provides pertinent geographic and demographic information for your web analysis.
Determine how important it is to have the HTML page titles of the URLs in your reports. The space required by these looked-up values can be fairly minimal, but the performance slowdown can be noticeable. Most people tend to perform the lookup to make reports more meaningful. Note: Web site security can impede or prevent HTML title lookups. You may need to configure a username and password to get the data.
Table limiting
Your system only has so much physical memory (called random access memory or RAM) in which to store the results of analysis. When data requirements exceed that memory, the system has to use virtual memory, exchanging data as needed from RAM to the hard disk and back to RAM. This can create a low-performance situation known as thrashing, in which a lot of activity is going on (swapping pages of data in and out of RAM), but little is being accomplished. Unfortunately, there is no perfect solution to the issue of overwhelming your memory with data. However, there are measures you can take to reduce how often your system has to swap data out to the disk. You can add more RAM, which up to a point will increase performance. Yet after you have added 2 GB of RAM there is no additional benefit.
Note: Most normal computers these days (that is, those with 32-bit processors) can address only 4 GB of memory (that is, virtual address space, regardless of how much physical RAM you might have), and they usually divide that in half: half for user processes and half for the operating system. So, 2 GB is a per-process limit. You could put 4 GB (or more) in a machine, and two user processes (that is, two programs running simultaneously) could each use 2 GB of physical RAM simultaneously. Some Windows versions (for example, the higher-end ones, such as Windows 2000 Advanced Server) can be configured to provide 3 GB of memory for user processes and 1 GB for the OS. WebTrends can use 3 GB if available.
A second approach that may be used by WebTrends software is to make smarter decisions about the data to swap out of RAM. By swapping out those items that most likely will not be needed in the future, the amount of time your system needs to access the hard disk is reduced. Another approach is to limit the amount of data that you store in your summary database tables.
The trade-off with this approach is that by limiting the number of entries in a summary table, you only collect records up to the point that you reach that limit. For example, if you limit the top pages table to 10,000 pages, then data will only be aggregated for the first 10,000 pages entered in the table. Any new pages encountered in the web data activity file after that will not be entered in the table. This means that if your site experiences a great deal of traffic and has 200,000 or 300,000 pages, then limiting it to the top 10,000 will significantly reduce the accuracy of your reports. However, if you were to limit it to the top 50,000, you might expect to get a reasonably accurate representation of the top pages in your reports. In addition to requiring less storage space in RAM, limiting tables also reduces the time spent inserting data into the database. These time savings are fairly minimal in comparison to those achieved by avoiding swapping data out to the hard disk. Whether you have to limit table sizes depends on three factors:
System processing speed
Amount of RAM
Tables being created (daily, weekly, monthly, quarterly, and/or yearly)
System processing speed affects how long the instructions and data must stay in main memory, while the amount of RAM affects how much data can be kept in main memory at any given time. And finally, the periods for which you have chosen to generate reports determine which tables exist and have data aggregated in them. If you have selected to aggregate data in yearly tables, then toward the end of a year you would be maintaining almost an entire year's worth of data. Because the summary tables have to be loaded in RAM to aggregate the data, the larger the amount of data, the more likely it is that you will have to swap out to hard disk.
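The accuracy trade-off of table limiting can be seen in a toy model. The following Python sketch assumes the behavior described above (only the first N distinct pages encountered are aggregated); the `LimitedTable` class and its attribute names are hypothetical and are not a WebTrends API.

```python
class LimitedTable:
    """Toy summary table that accepts only the first max_entries distinct pages."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.counts = {}
        self.skipped_hits = 0   # hits on pages that never entered the table

    def add(self, page):
        if page in self.counts:
            self.counts[page] += 1
        elif len(self.counts) < self.max_entries:
            self.counts[page] = 1
        else:
            # Table is full: new pages encountered after the limit are dropped.
            self.skipped_hits += 1


table = LimitedTable(max_entries=2)
for page in ["/a", "/b", "/a", "/c", "/c", "/c"]:
    table.add(page)

print(table.counts)        # "/c" is missing even though it had the most hits
print(table.skipped_hits)
```

In this small run the most-viewed page never appears in the table because it was first seen after the limit was reached, which is why a limit that is too low for the size of the site distorts a top pages report.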
Recommendations
If you trade accuracy for speed, you need to be certain that you really need that report. Use WebTrends software to limit the number of elements that are fed into the tables. Also, you can limit tables for your custom reports.
Performance issues
Simultaneous analysis
Many web analysis applications are multi-threaded applications, meaning that they can run multiple processes simultaneously. Depending on the number and speed of the processors and memory in your analysis system, you may increase performance by running more than one analysis at a time.
Recommendations
Have no more than one simultaneous analysis for each processor in the analysis system. Each processor should have at least 2 GB of RAM.
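As a rough illustration of this recommendation, the Python sketch below caps the number of simultaneous analyses by both processor count and RAM (2 GB per analysis). The function name is hypothetical, and the floor of one analysis on an undersized machine is an assumption added for the example.

```python
import os

GB = 1024 ** 3


def max_simultaneous_analyses(processors, ram_bytes):
    """Apply the rule of thumb: one analysis per processor, 2 GB of RAM each."""
    by_cpu = processors
    by_ram = ram_bytes // (2 * GB)
    # Assumption: even an undersized machine still runs one analysis at a time.
    return max(1, min(by_cpu, by_ram))


print(max_simultaneous_analyses(4, 8 * GB))   # enough RAM for every processor
print(max_simultaneous_analyses(4, 4 * GB))   # RAM is the limiting factor

# os.cpu_count() reports logical processors and may return None.
print(max_simultaneous_analyses(os.cpu_count() or 1, 8 * GB))
```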
How long to keep a given report: do you hold onto each daily report for one month, two months, or longer?
How many elements to store in a report: 100, 2000, or 20,000?
Reporting is one of the key elements to consider when deciding how to allocate resources, because the report rendering process itself demands a lot from your system's performance, and after you've created those reports, each one requires a fair amount of storage space. Rendering reports is a fairly processing-intensive task. The report engine must first look up all the information requested by the report templates. It must then create tables and graphs that are populated with all the requested information. Depending on the report periods requested (such as daily, monthly, and yearly), your report engine may have one or more different reports to render for each report type. Keep in mind that each stored report can occupy a fair amount of memory: up to 1 MB, for example, for a basic report that comes packaged with WebTrends software. Therefore, always consider the amount of time and resources involved in generating reports. For example, if it takes an hour to generate a complete day's report and you generated it every hour, each run would take more than an hour because of the overhead involved in shutting down and starting up processes. Your system might also experience thrashing if you generated reports too frequently.
Recommendation
Many IT departments prune reports to contain only the tables/charts that may be of interest to the particular audience. Culling the reports makes them less daunting, more accessible, and reduces processing time and storage needs. You should track which reports are viewed by business users and then remove those that are never accessed.
Keep only eight quarterly reports and two yearly reports. By limiting the number of reports kept in the On Demand Database, you reduce the storage space required. There's a trade-off between keeping massive amounts of data and maintaining a robust database that generates reports efficiently. Some organizations may find great value in keeping a lot of historical data, no matter what the cost. Other organizations may find maintaining daily reports from the previous year to be of little value. It's a matter of what your organization needs and can afford.
Archiving: Click on Administration > System Management > Backup/Restore > Restore Backup
Internet resolution: Click on Web Analysis > Options > Analysis > Internet Resolution
HTML page title lookups: Click on Web Analysis > Options > Analysis > General. You will see Retrieve HTML page title.
Table limiting: Click on Web Analysis > Options > Analysis > Table Limiting
Report database: Click on Administration > System Management > Data Retention > Report Database
Elements in report tables in standard tables: Click on Web Analysis > Report Designer > Options > Reports
Elements in custom reports: Click on Web Analysis > Report Configuration > Custom Reports > Reports > Dimensions
Optimizing Worksheet
Use the following worksheet to help optimize your analysis environment.
Consideration | Yes | No | Comments
Do you plan to archive your web data activity files and do you know how long you will keep them archived? | | |
Do you have adequate storage space for the archived files? | | |
Do you plan to back up analysis data, including summary tables? | | |
Do you have adequate storage space for backup data? | | |
Do you plan to cache uncompressed web data activity files for re-analysis? | | |
Do you plan to use IP address lookup (also known as reverse DNS)? | | |
Can you improve your system performance if it slows down because of IP address lookup? | | |
Consideration | Yes | No | Comments
Do you plan to look up HTML page titles? | | |
Can you improve your system performance if it slows down because of HTML page title lookups? | | |
Have you maximized the size of your RAM? | | |
Can you limit the size of your summary tables? | | |
Can you limit the size of your reports? | | |
Glossary
Abandonment Rate
For a scenario or multi-step process, the percentage of initiated scenarios that were not completed during the visit. Scenarios can be defined in many ways: for example, the entire shopping process, a finite checkout process at an e-commerce site, a registration process at a lead generation site, or a search process at an information site.
Acknowledgement Page
A page that is displayed after a visitor completes an action or transaction: for example, a Thank-you or Receipt Page. An Acknowledgement Page is often important in Scenario Analysis, where it is an indicator of a completed scenario.
Acquisition
The process of attracting a visitor to your web site.
Activity
A general term referring to nearly any site measurable, including visits, hits, visitors, and viewing time.
Ad
A link, usually commercial in nature, consisting of a graphic or text that takes a visitor to a web site when clicked on. An abbreviation for advertisement.
Ad Campaign
A specific effort to attract visitors to your site through ads. It may be one individual ad or a coordinated set of ads treated as one entity for reporting purposes. On the web, ad campaigns usually consist of e-mails, graphics on other sites or on a wireless interactive appliance, and traditional media such as direct mail, print, broadcast, outdoor advertising, etc. In WebTrends, ad campaigns are set up by the reporting administrator with a unique URL/landing page, a starting date, an ending date, and a cost. Same as Campaign and Marketing Campaign.
Ad Click
A click on an ad resulting in a jump to the site being advertised.
Ad View
A display of an ad on a page that is viewed during a visit. There may be more than one ad view on a page.
Address
An Internet term loosely referring to the location of a web site or web page on the Internet or the Web. Or, more specifically, an identifier for a specific computer that is connected to the Internet.
Aggregate
Combining data of two or more dimensions in a report. For example, adding up all Departments to get Total Division data. While such combinations are normally sums, any type of formula might be used.
Authenticated User
A visitor who used a username-password login process to get access to all or part of a web site. The username (but not the password) is captured in a specific field in web site log files or through client-side data collection tags. Since it is possible for many different unique visitors to have the same IP address, authenticated username is perhaps the most accurate way to count unique visitors. You may find more authenticated user names than total visitors because several persons may be using the same IP address; this is particularly common on corporate intranets where a large number of visitors are sharing a smaller pool of IP addresses.
Authentication
Technique that limits access to Internet or intranet resources to visitors who identify themselves by entering a user name and password.
Average
A statistical term referring to the sum of a measure divided by the number of items measured. For example, for a series of 11 visits consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the average number of page views is 14.9 (total 164 divided by 11), the median is 10 (the 6th in the series of 11) and the mode is 7. In statistics, average is also called the mean.
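The figures in the Average entry can be checked with the Python standard library. This is a verification sketch only; the variable name is arbitrary.

```python
from statistics import mean, median, mode

# Page views per visit, taken from the example in the Average entry.
page_views = [3, 7, 7, 7, 8, 10, 15, 22, 25, 25, 35]

print(sum(page_views))             # total page views across the 11 visits
print(round(mean(page_views), 1))  # average (mean), rounded to one decimal
print(median(page_views))          # middle value, the 6th of 11
print(mode(page_views))            # most frequent value
```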
Average Frequency
The average of the frequencies of all the visitors during the reporting period, where each visitor's frequency is the number of times they have visited the site since WebTrends visitor tracking began.
Average Latency
The average of the latencies of all the visitors during the reporting period, where each visitor's latency is the average elapsed time, in days, between all their visits since WebTrends visitor tracking began.
Average Lifetime Value
The average of the lifetime values of all the visitors during the reporting period, where each visitor's lifetime value is the total monetary value of that visitor's past orders since WebTrends visitor tracking began.
Average Recency
The average of the recency values of all the visitors during the reporting period, where each visitor's recency is the elapsed time, in days, since their last visit.
Banner, Banner Ad
An online advertisement, usually a graphic, which can be anywhere on a web page but typically refers to a horizontally elongated graphic of significant size located at the top or bottom of a web page.
Bookmark
In a browser, a shortcut to a web site page that is created by the visitor to allow a quick one-click return to the page in the future. Bookmarks are called Favorites in some browsers. Visitors arriving at a site by clicking on a bookmark will appear as a Direct Traffic entry in Referrers reports.
Browser
A program, such as Microsoft Internet Explorer or Netscape, used to locate and view web pages as well as to follow hyperlinks. The Browser is identified in the Agent or User Agent field of a web site log or through standard client-side data collection tags.
Campaign
A specific advertising effort to attract visitors to your site. A campaign may be one individual ad or a coordinated set of ads treated as one entity for reporting purposes. For online channels, campaigns usually consist of e-mails, graphics on another site or on a wireless interactive appliance, and traditional media such as direct mail, print, broadcast, outdoor advertising, etc. In WebTrends, campaigns are set up by the reporting administrator with a unique URL/landing page, a starting date, an ending date, and a cost. Same as Ad Campaign and Marketing Campaign.
Campaign Creative
A creative describes the characteristics of a marketing activity, such as color, size and messaging; for example, a Buy Now graphic. These creative elements are used to encourage clickthrough to the web site. Campaign Creative is a level within the drilldown categorization scheme set up by the WebTrends administrator, which allows for reporting on groups of campaigns in a way that is meaningful to the report users.
Campaign Drilldown
In certain WebTrends reports, a drill-down feature allows the user to navigate from a highly summarized level of data to successively more detailed levels of data, organized along a concept hierarchy. With Campaign Drilldown, users can examine visits, page views, revenue, average order size, and more, by Campaign Partner, Demand Channel, Marketing Program, Marketing Activity, Campaign Name, Campaign Creative, Campaign Offer, and other campaign attributes.
Campaign ID
A unique campaign identifier used to calculate campaign success, cost, etc., which may involve several different marketing activities, or a single effort. Campaign ID is a level within the drilldown categorization scheme set up by the WebTrends administrator, which allows for reporting on groups of campaigns in a way that is meaningful to the report users.
Campaign Type
This is a user-defined category, which might include online banner ads, e-marketing newsletters, and direct mail campaigns. Campaign Type is a level within the drilldown categorization scheme set up by the WebTrends administrator, which allows for reporting on groups of campaigns in a way that is meaningful to the report users.
Checkout Page
The page or series of pages viewed when a visitor goes through the process of buying something online.
Child Profile
WebTrends can use Child Profiles to report on a web site that shares a log file with other unrelated sites due to a constraint or choice by a hosting provider. Child profiles can be helpful if an ISP or web hosting service hosts multiple customer sites on its web servers. To a web site visitor, a customer's site can appear as a distinct, stand-alone domain, but often the web activity data for each customer site is recorded and lumped together in the service provider's main web server log file. If service providers want to offer their customers a set of basic web activity reports with data specific to each customer's site, they need a means of breaking out data by customer. Because service providers also want to reduce management and maintenance of this data splitting process, they want WebTrends to auto-discover and split out these data subsets while parsing the log file. Parent-Child profiles provide this auto-discovery functionality, and also create profiles, called Child profiles, for these data subsets.
Click
The act of activating a hyperlink, usually by physically pressing down (clicking) on a mouse button when the cursor is over a link on a page. In Web advertising, a click is an instance of a user activating an advertising link to go to an advertiser's web site or page.
Click-through-Rate
The number of clicks on an ad as a percentage of the total views of the ad during the reporting period.
Client
A computer (or software on a computer) that accesses resources provided by another computer, called a server.
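The Click-through-Rate definition above amounts to a one-line formula. The Python sketch below shows it with made-up numbers; the function name and the choice to return 0.0 when an ad has no views are assumptions for illustration.

```python
def click_through_rate(ad_clicks, ad_views):
    """Clicks on an ad as a percentage of total views of that ad."""
    if ad_views == 0:
        return 0.0   # assumption: report zero rather than divide by zero
    return 100.0 * ad_clicks / ad_views


# Hypothetical reporting period: 25 clicks out of 1,000 ad views.
print(click_through_rate(25, 1000))
```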
Client Errors
An error occurring due to an invalid request by the visitor's browser. Client errors are in the 400 range (see Status Code on page 227 for a list).
Client-side Data Collection
An alternative to traditional web server log file analysis that involves collecting data directly from the visitor's browser (the client) rather than from server log files, improving data accuracy. Special script in a page's source code is used to transmit page-level data, not hit-level data, to a data collection server, dramatically reducing data volume and decreasing processing time. Client-side data collection obtains more accurate information than log files do, both by accurately tracking visitor activity normally hidden by the browser's local cache and by proxy and caching servers (like those used with an AOL account), and by collecting extra, customized data not included in normal web server log files. Accuracy is also improved since spiders do not trigger client-side tags; with log files, spiders can appear to be real visitors unless their activity is filtered out. However, client-side methods provide no information on server technical performance or bandwidth use. The WebTrends proprietary client-side data collection technology is called SmartSource.
Combined Log File Format
A basic (common) log file with two additional fields, the Referrer and User Agent fields. Also referred to as Extended Log File Format.
Content Group
An administrator-defined group of one or more web pages that is treated as one entity in certain reports such as Content Groups and Content Paths. Content Groups are created by a WebTrends administrator to group pages according to similarities that are meaningful in the context of your web site.
Content Path
A consecutive sequence of two or more Content Groups viewed during a visit.
Conversion, Conversion Rate
The percent of a group (of visits or visitors) that took a specific action of interest. The term Conversion can apply to any type of action a web site wants its visitors to perform, and any type of goal or mission a visitor wants to complete on the site. Conversion can encompass the entire visit population, such as the percent of all visits that involved a completed registration.
Conversion can also refer to a very small and precise action, such as the percent of people at step 3 of a scenario who continued to step 4; or it can apply to a subpopulation, such as the percent of knowledgebase searches that result in issue resolution.
Cookie
When a user's browser requests a page from a web site server, the server often returns a cookie, a small text file sent to a browser by a web site to be stored locally. In its simplest form, this text file usually contains a long unique string of characters that helps the web site recognize that visitor when he or she makes subsequent page requests. One purpose of a cookie is to let the server keep track of important information through the course of a visit, such as the items added to a shopping cart by a visitor. Without a cookie, many online transactions would not be possible because the web site would not be able to associate, for example, information entered on the shipping address page with information entered on the payment page. The browser user controls whether a browser accepts cookies or not. If the browser is set to accept cookies, WebTrends uses the cookie character string to divide the mass of page views into individual visits. If a cookie is the persistent type that is stored on the client's hard disk, WebTrends also uses the cookie to define a visitor as either first-time or returning. WebTrends can also use the cookie to associate previous visits with a particular visitor in order to report on past purchases, lifetime value, or past responses to campaigns.
Custom Filter
A hit or visit filter created in the Custom Reports section of the WebTrends Admin Console. Custom filters can be a variation of a filter already in use or can be completely new, based on a variety of hit or visit characteristics. Visit-related custom filters are especially powerful, allowing the inclusion or exclusion of entire visits depending on whether a specific page was viewed at any point in the visit.

Dashboard
A customizable WebTrends report consisting of summary information (usually graphs) from individual WebTrends reports in a profile, all grouped on one page. Dashboards provide a quick overview of key information for individuals, departments, and specific roles.

Data Source Splitter (DSS)
A WebTrends feature allowing several profiles to use the same set of log files more efficiently rather than having to create separate profiles in the standard WebTrends manner. An organization with several virtual domains all served by the same set of web servers, and all logging to the same set of log files, would be a candidate for using DSS. Another would be a hosting provider with several different domains logging to the same log files on the same servers. DSS allows an administrator to create profiles for each of the virtual domains, which splits the log files into smaller logs based on the domain names, so that domain-specific profiles can be run on the smaller logs.

Destination Page
An administrator-specified page used in Destination Paths reports as the page to which all the analyzed paths lead.

Dimension
Elements or categories being reported on in a WebTrends report. A dimension usually does not have a numerical value; examples are Pages and Content Groups. Dimensions are statistically described using Measures, which do have a numeric value, such as visits, views, and view time. In WebTrends reports, the dimension is the first column, or the first two columns if both a Primary and a Secondary dimension are used. Dimensions are also presented in drill-down format in some WebTrends reports.

Directory
A web site is made of files that are usually grouped in buckets of similar files, such as all product pages or all Human Resources pages. In a complex web site, buckets can contain smaller buckets, such as Human Resources procedures pages and Human Resources job listings, and the levels of buckets can go quite deep. The buckets, which may or may not have names that clearly indicate their contents, are called Directories. The smaller buckets within a bucket are called Subdirectories. This categorization is often reflected in the address of a web page, which includes not only the name of the page (joblistings.html), but also the series of buckets it belongs in, separated by slashes (/international-company-info/USA-company-info/USA-human-resources/).
WebTrends uses the Directories concept in two ways. First, a Directory can be used to filter page views by specifying directories to include or exclude. Second, a Directories report tallies the activity in individual directories.
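The two uses of directories described above can be sketched in a few lines of Python. This is only an illustration under assumed inputs (a plain list of URL paths); the function names `directory_of` and `tally_directories` and the sample paths are hypothetical and are not part of WebTrends.

```python
from collections import Counter
import posixpath


def directory_of(url_path):
    """Return the directory part of a URL path, e.g. '/a/b/page.html' -> '/a/b/'."""
    directory = posixpath.dirname(url_path)
    return directory if directory.endswith("/") else directory + "/"


def tally_directories(url_paths, include=None):
    """Count page views per directory.

    If `include` is given, only directories starting with one of those
    prefixes are counted (a simple include filter).
    """
    counts = Counter()
    for path in url_paths:
        directory = directory_of(path)
        if include and not any(directory.startswith(prefix) for prefix in include):
            continue
        counts[directory] += 1
    return counts


# Hypothetical page views.
paths = [
    "/hr/jobs/joblistings.html",
    "/hr/jobs/apply.html",
    "/hr/procedures/leave.html",
    "/products/couch.html",
]
all_directories = tally_directories(paths)
hr_only = tally_directories(paths, include=["/hr/"])
```

Here `all_directories` counts activity in every directory, while `hr_only` keeps only directories under /hr/.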
DNS Lookup (Domain Name Service Lookup)
The process of converting a numeric IP address into a text domain name. For example, DNS Lookup will convert the IP address 255.255.255.255 to the domain name YourDomain.com. DNS Lookup can be turned on and off by the WebTrends administrator. DNS stands for Domain Name System; the servers that perform lookups are called domain name servers. DNS Lookup is also called IP Resolution and Domain Name Lookup.

Documents
A legacy term referring to pages that were defined as documents by the system
administrator. Traditionally, a page is a document if the content is static, such as an HTML page.

Domain Name
The text name corresponding to the IP address of a computer on the Internet. For example, netiq.com is a domain name. A domain can be associated with many IP addresses but an IP address can have only one domain.

Domain Type
A broad categorization of domain names identified by the suffix, such as .edu (for domains related to educational institutions), .com (for domains related to commercial web sites), .org (for domains related to non-profit organizations), .gov (for domains related to governments), and many others. The domain type does not necessarily reflect the true nature of the web site, as domain suffixes are only loosely regulated, if at all.

Drill Down
In certain WebTrends reports, the drill-down feature allows the user to navigate from a highly summarized level of data to more detailed levels of data, organized along a concept hierarchy.
On a web site, drilling down is the act of going further down a branch of the site in search of more detailed information. Often, drilling down results in seeing a series of different navigation bars, each appropriate to its own level.
DSS
See Data Source Splitter.

Dynamic Page
A page that is created by the web server from a template, or a general page structure, which is filled in with content pulled from a database. Servers build dynamic pages from particular components according to requests they receive from browsers.
The URLs of dynamic pages typically consist of the template name, followed by a question mark, followed by the content for the displayed page as a series of text strings separated by ampersands in the format parameter=parametervalue. For example, a page showing a blue Empire couch might be /product.asp?item=couch&type=Empire&color=blue. The parameters can be of great interest in web analytics, when shown as tabulated summaries of views of couches, Empire items,
Entry
Entry File
The first file requested in a visit. A visit has one and only one entry file. Files may be of any type, including a page file.

Entry Page
The first page requested in a visit. A visit has one and only one entry page. Note that a visit will have no pages if it doesn't include a page file.

Entry-Exit Page
A page view that is both the entry and the exit page; the only page in a Single-Page Visit.

Exit Page
The last page viewed in a visit.

File
A collection of information stored under a unique name, often in the form name.extension, where the extension identifies the type of file and usually implies what kind of program can open or view it. On the Web, common types of files are: page files (.htm, .asp, .jsp, .cfm, etc.), image files (.gif, .jpg, .png, etc.), applet files (.js, among others), non-page document files (.doc, .txt, .pdf, etc.), and style files (.css, among others). While a page file is technically different from a page (see Page), a page always includes a page file.
File Type
Corresponds to a file's extension. For example, a file named graphic.gif is identified as type gif.

Filter
A setting in WebTrends that instructs the program to exclude or include (to the exclusion of all else) certain visits or hits from the analysis. In WebTrends, filters can be used individually or in groups, and individual filters can be combinations of different subparts.
First-Time Buyer
A visitor who has made his or her first purchase. Also called New Buyer.
Forms
Scripted pages that pass variables back to the server. These pages are used to submit information entered by visitors in the form's fields.
Frequency
The number of times a visitor has visited a site since tracking with persistent cookies and Visitor History began. Average Frequency is the average of the frequencies of all the visitors during the reporting period. Frequency is a retention metric and is part of RFM (recency, frequency, monetary) analysis. If visitors did not visit the site during the report time period, their frequency is not included.

FTP
File Transfer Protocol. A standard method of sending files from one computer to another over the Internet.

Funnel
A profile of increasing attrition that happens as site visitors go through a scenario, or a series of defined steps such as a purchase, an information hunt, or a registration on a web site. Because the number of people participating in each step is usually smaller than in the step before, a graph of the declining participation, when mirrored, resembles a funnel.
Geography Drilldown
In certain WebTrends reports, a drill-down feature allows the user to navigate from a highly summarized level of data to successively more detailed levels of data, organized along a concept hierarchy. With geography drilldown, users can examine activity by areas of visitor origination, for example, viewing visits, page views, revenue, or average order size, or viewing by Region, Country, State/Province, or City.

GeoTrends Database
The optional GeoTrends Database resolves IP addresses of visitors into more meaningful data such as the region, country, state/province, city, area code, designated marketing area, metropolitan statistical area, and time zone data corresponding to the location of the owner of a specific domain name. In the specific case of AOL IPs, location is resolved to geographic regions served by AOL as opposed to the location of AOL in the state of Virginia. GeoTrends Database replaces the older WebTrends Company Database.
GIF
A graphics file format and file extension (*.gif) commonly used on web pages, referring to Graphics Interchange Format.

Hit
A request for a file by a browser. Since "file" refers to images, styles, and many other elements besides .html pages, a single web site page view can involve dozens of hits. Because the number of hits is so heavily influenced by the complexity of a page, hits are a far less helpful measure of site traffic than visits or visitors. The hits statistic is somewhat useful in assessing the load experienced by a web server.
WebTrends SmartSource Tags do not capture hit-level data.
Homepage
The main or introductory page of a web site, usually designed with the expectation that it is the first page a visitor sees. It is also the default page that is sent in response to a request containing only the domain name.

Homepage URL
The URL for the homepage of the site analyzed in the report. The homepage URL is specified during WebTrends setup in order to help WebTrends consolidate hits to several versions of the homepage, for example, Flash and non-Flash versions or framed and frameless versions.

HTML
The abbreviation for Hypertext Markup Language, which is used to format text files so that web browsers can display text with appropriate hyperlinks, font sizes, and other text formatting.

HTTP
The abbreviation for Hypertext Transfer Protocol, a standard method of transferring data between a web server and a web browser. It is the text string that appears at the beginning of web addresses, and it informs a browser that the request is for a web page as opposed to an FTP site or another type of browser destination.
Instrumented Web Page
A web page that contains a WebTrends SmartSource Tag. The SmartSource Tag does two things. First, it transmits traffic data (similar to that in a standard IIS or Solaris log) to the WebTrends SmartSource Data Collector for processing into
reports. Second, if set up to do so, it also collects and transmits a wide variety of optional extra data to the same Data Collector.

IP Address
A numeric phrase used to identify a computer connected to the Internet. IP addresses consist of four one-to-three-digit numbers separated by periods, for example, 212.6.125.76. WebTrends allows filtering activity coming from a specific IP address or range of addresses.

JavaScript Tag
A script (JavaScript or sometimes VBScript) that can be added to the code of a web page to capture information about a visit to that web page (for example, IP of visitor, time of day, name of page, parameters, etc.) and send it to a data collection server such as WebTrends SmartSource Data Collector.

JPEG
An abbreviation for Joint Photographic Experts Group, referring to a compressed graphics format common on the Internet. Also called JPG.

Jump
Navigation or moving from one page to another using a link.
Landing Page
A page on a web site (which may or may not be the home page) where the visitor arrives. For example, in an email campaign, you would use a landing page as the page to which the email directs the prospect via a link.

Latency
The average number of days between visits for a given visitor since tracking with persistent cookies and Visitor History began; for example, those who visit on average every 7 days. For a given visitor, a lapse of 12 days between the first and second visit, and a lapse of 24 days between the second and third visit, equals a latency of 18 days. Note that a zero latency means the average time between visits is less than 24 hours. If visitors did not visit the site during the report time period, their latency is not included.

Lifetime Value
The total monetary value of a visitor's past orders since tracking with persistent cookies and Visitor History began. Average Lifetime Value is the average of all the Lifetime Values of the visitors who visit the site during a reporting period. If
visitors did not visit the site during the report time period, their Lifetime Value is not included.

Link
On a web page, text or an image that has been coded to take a browser from one page to another, or from one site to another.
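The arithmetic in the Latency entry above (gaps of 12 and 24 days averaging to 18) can be reproduced with a short Python sketch. The function name `latency_days` and the sample dates are assumptions for illustration; how WebTrends stores visit dates in Visitor History is not described in this glossary.

```python
from datetime import date


def latency_days(visit_dates):
    """Average number of days between consecutive visits, or None with fewer than two visits."""
    ordered = sorted(visit_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical visitor: visits 12 days apart, then 24 days apart.
visits = [date(2004, 1, 1), date(2004, 1, 13), date(2004, 2, 6)]
latency = latency_days(visits)  # (12 + 24) / 2
```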
Log File
A file on a web server that contains records of activity related to requests for site content from browsers, spiders, and other outside entities.

Log File URL
The full address, including network ID, drive, and directories, of the web server log files that are to be analyzed in a profile.

Loyal Visitor
A visitor who visits a site relatively frequently.

LTV
Same as Lifetime Value.
Marketing Campaign
A specific effort to attract visitors to your site. It may be one individual ad or a coordinated set of ads treated as one entity for reporting purposes. In the web world, marketing campaigns usually consist of e-mails, graphics on another site or on a wireless interactive appliance, and traditional media such as direct mail, print, broadcast, outdoor advertising, etc. In WebTrends, campaigns are set up by the reporting administrator with a unique URL/landing page, a starting date, an ending date, and a cost. Same as Campaign and Ad Campaign.

Mean
A statistical term referring to the sum of a measure divided by the number of items measured. Also called the average. For example, for a series of 11 visits consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the mean number of page views is 14.9 (total 164 divided by 11), the median is 10 (the 6th in the series of 11), and the mode is 7.
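The figures in the example above can be checked with Python's standard `statistics` module. This is simply a verification of the arithmetic, not a description of how WebTrends computes its measures; the variable names are chosen for this example.

```python
import statistics

# The series of page views per visit used in the glossary example.
page_views_per_visit = [3, 7, 7, 7, 8, 10, 15, 22, 25, 25, 35]

mean = statistics.mean(page_views_per_visit)      # 164 / 11
median = statistics.median(page_views_per_visit)  # 6th value of the sorted series
mode = statistics.mode(page_views_per_visit)      # most frequent value

print(round(mean, 1), median, mode)  # prints: 14.9 10 7
```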
Measures
Quantities being reported on in a WebTrends report. Measures are quantitative in nature and appear in WebTrends reports as columns to the right of the Dimension column(s), statistically describing them. In Custom Reports, the
WebTrends administrator can define and use a wide variety of Measures.

Median
A statistic used as an alternative to Average. In a collection of numbers that have been ordered by size, the Median is the middle value: half of the numbers are at or below it and half are at or above it. The Median is less distorted by extreme numbers than is the Average. For example, for a series of 11 visits consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the median is 10 (the 6th in the series of 11). The average is 14.9 and the mode is 7. For an even-numbered series, such as 12 visits, the median is the average of the middle two numbers.

Mode
A statistic used as an alternative to Average. In a collection of numbers, it is the number that appears most often. For example, for a series of 11 visits consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the mode is 7. The median is 10 in this series (the 6th in the series of 11), and the average is 14.9.
Monetary Value
The total value of a visitor's past orders or transactions since tracking with persistent cookies and Visitor History began. Same as Lifetime Value. Average Monetary Value is the average of all the Lifetime Values of the visitors during a reporting period. If visitors did not visit the site during the report time period, their Monetary Value is not included.

Most Recent Campaign
The last campaign that a visitor responded to since tracking with persistent cookies and Visitor History began. For the report time period selected, all conversions and other activity are tracked and attributed to visitors' most recent campaigns. Only those most recent campaigns whose durations have not expired are included, and the report administrator sets this expiration. Thus, even if the conversion does not happen on the first visit generated by the most recent campaign, the appropriate source is credited with the conversion. If visitors do not visit the site during the report time period, their most recent campaign is not included.

Multi-Homed Domain
The domain name or IP address of one of the sites in a multi-homed log file. You can report on a single domain using the Multi-Homed Domain Filter.
Multi-Homed Log File
A single log file that contains the access information for multiple web sites. To specify which domains are analyzed in this type of file, use the Multi-Homed Domain Filter.

Multi-Homed Web Server
A single server that hosts more than one web site.

Multi-Page Visit
A visit in which more than one page was viewed. In other words, any visit that is not a single-page visit.

Navigation
The act of moving from location to location within a web site, or between web sites, accomplished by clicking on links. Navigation also can refer to the overall structure of the links on the site, comprising the paths available to the visitor.

New Visitor
A visitor who has never been to the site since tracking with WebTrends and persistent cookies began.
New visitors are identifiable only on sites that give out persistent cookies. WebTrends identifies visitors as new visitors if they have no site cookie when they arrive, and they are able to accept a cookie for their subsequent page views. If they already have a site cookie when they arrive, they must have been to the site before. In a log file, a new visitor's first page view has no cookie, but all other page views do. It's important to realize that "never been to the site before" can be evaluated only for the time period during which the persistent cookie has been given out. In fact, when a persistent cookie is first implemented, all visitors appear to be first-time visitors. Visitors whose browsers do not accept cookies appear as unknown in reports that display new and returning visitors.
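The rule described above can be expressed as a small Python sketch. It assumes each visit is represented as the ordered list of site-cookie values seen on its page views, with `None` meaning that no cookie was sent; this representation and the function name `classify_visitor` are assumptions for illustration only. As a simplification, a visit whose first page view has no cookie counts as new if any later page view carries one.

```python
from typing import Optional, Sequence


def classify_visitor(cookies_per_page_view: Sequence[Optional[str]]) -> str:
    """Classify a visit as 'new', 'returning', or 'unknown'.

    The argument holds the site cookie seen on each page view, in order;
    None means the request carried no site cookie.
    """
    if not cookies_per_page_view:
        return "unknown"
    if cookies_per_page_view[0] is not None:
        # A cookie was already present on arrival: the visitor has been here before.
        return "returning"
    if any(cookie is not None for cookie in cookies_per_page_view[1:]):
        # No cookie on arrival, but one was accepted for later page views.
        return "new"
    # No cookie ever appears (cookies refused, or a single page view with no cookie).
    return "unknown"
```

For example, a visit recorded as `[None, "id1", "id1"]` is classified as new, while `["id1", "id1"]` is classified as returning.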
No Referrer
A line item in the Referrers reports that pertains to visits that have no known referring site, domain, or URL. Usually, this means that visitors arrived at your site by typing the URL of your site into their browser address window, by using a bookmark, or by clicking on a link in an e-mail. If No Referrer is the only line in a Referrers report, this usually means the Referrer field is not used in your traffic logging.
Order
Order Count
The number of completed purchases.

Order Quantity
The number of items purchased in an individual order.

Order Value
The monetary amount of an order.

Organic Search Phrase
A search phrase for which your site shows up on result pages because of the search engine's method of ranking pages, as opposed to paid placement.

Other
A term appearing at the bottom of WebTrends report tables for any table that spans several pages. In these situations, "other" refers to table line items that appear on the other pages of the table, whether before or after the portion of the table being viewed. WebTrends uses the "other" quantity to indicate the proportion of the total picture that is the viewable part of the list.
Paid Search Phrase
A search phrase for which your site shows up on result pages due to paid placement with the search engine, as opposed to its method of ranking pages (Organic).

Page
Same as web page. In terms of a web site visitor's experience, a page is a unit of site content, often resembling a paper page of indefinite length and width, that has a single URL address. What the visitor sees as a page is usually a collection of files, always including one page file (.htm, .jsp, .asp, .cfm, etc.), plus, depending on the page, image files (.gif, .jpg, .png, etc.), style files (.css, among others), applet files (.js, among others), and a variety of other types of files. In WebTrends' default settings, a page is technically defined as a file with the following extensions: .htm, .asp, .jsp, .cfm, etc. This technical definition can be modified by the administrator to include or exclude any file extension.
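To make the difference between hits and page views concrete, the following Python sketch classifies requested URLs by file extension. The extension set is an illustrative assumption based on the list above (with .html added for the example); the actual WebTrends default list is only partly given in this glossary and can be changed by the administrator. The function names are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption: illustrative page extensions. The glossary lists ".htm, .asp, .jsp, .cfm, etc."
# and notes that the administrator can change the list; ".html" is added here as an example.
PAGE_EXTENSIONS = {".htm", ".html", ".asp", ".jsp", ".cfm"}


def is_page(url, page_extensions=PAGE_EXTENSIONS):
    """Return True if the requested file's extension marks it as a page file."""
    path = urlsplit(url).path  # drop any query string
    extension = posixpath.splitext(path)[1].lower()
    return extension in page_extensions


def count_hits_and_page_views(requested_urls, page_extensions=PAGE_EXTENSIONS):
    """Every request is a hit; only requests for page files are page views."""
    hits = len(requested_urls)
    page_views = sum(1 for url in requested_urls if is_page(url, page_extensions))
    return hits, page_views


# Hypothetical requests generated by viewing two pages.
requests = [
    "/product.asp?item=couch",
    "/images/couch.gif",
    "/styles/site.css",
    "/scripts/menu.js",
    "/index.htm",
]
hits, page_views = count_hits_and_page_views(requests)
```

In this sample, five hits correspond to only two page views, which illustrates why hits overstate traffic relative to page views.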
Page View
Technically, a page that is displayed by a browser. This term is often used loosely
to also include page files that are delivered to a browser, whether or not they are displayed on the screen. An example of a Page View that is not actually displayed is a Redirect Page.

Palm Browser
A program used on a Palm device to display site content, similar to Netscape or Internet Explorer on PCs.

Palm Device
A portable personal computer small enough to fit in the palm of a person's hand, specifically those made by the company Palm and using the Palm operating system.

Parameter
Parameters are located in the URL after a question mark. Each consists of a name followed by an equal sign and a value, known as a name=value pair. For example, in the URL /products/furniture.asp?cart_id=445&product=couch, there are two parameters: cart_id is the name and 445 is the value, and product is the name and couch is the value. When a URL contains more than one parameter, the name=value pairs are separated by the & symbol.

Parent-Child Profiles
A specialized way of setting up profiles for different web sites that share servers and log files. Setting up a Parent-Child arrangement automates the creation of profiles and reports on a number of domains or subdomains from a single log file. New domains or subdomains automatically generate new profiles.

Path
The sequence of all pages viewed during a visit, or any portion of that sequence. In WebTrends reports, paths either have a designated starting point (the visit entry page or a designated path start page) or a designated end point (destination page); or, paths are Top Paths, which, regardless of specific start page or end point, are common routes through the site. Technically, any visit contains many paths, each consisting of two or more sequential page views. Paths can also refer to content group paths instead of paths consisting of individual pages.
The length of paths tracked is either determined by the number of pages viewed, or by the path analysis length limit if the number of pages viewed is greater than the limit.
Path Analysis
A report displaying and quantifying paths that fit the criteria set up by the WebTrends administrator, including a starting point or an ending point (destination), and a path analysis length limit.

Path of Interest
Describes a concept and practice of focusing path analyses on a particular area of interest. With WebTrends this is typically done with Destination Paths and Paths From Starting Page reports, though technically Top Paths and Paths From Entry are also paths of interest.

Percent Change
In a comparative date range display, a positive or negative percentage that indicates the size of the increase or decrease between the first and second date range. A value of 100% indicates that the second date range's value is twice that of the first date range's value; that is, 100% more than the first value. Percent change is calculated by subtracting the first date range's value from the second date range's value, dividing the result by the value of the first, and multiplying by 100.

Persistent Cookie
A cookie that lasts longer than the duration of a visit and is saved in the Cookie folder on the browser's computer. It is used by WebTrends to distinguish new from returning visitors, among other things.

Platform
The operating system, such as Linux or Windows, used by the visitor's computer.

Product
A specific good or service that is sold or displayed on a web site.

Product Group
The highest-level categorization of products used in product drilldowns, for example, Electronics. The WebTrends administrator defines levels used in the categorization scheme to allow reporting on groups of products in a way that is meaningful to the report users.

Profile
A collection of WebTrends report settings and definitions used to generate, analyze, and distribute the set of reports. It is integral to producing WebTrends reports. The characteristics of a Profile include the location of the
log files and specific information about their content that will be used in analysis, such as which page URLs are to be assigned to Content Groups and which page URLs are to be starting pages for path analysis. When specified in conjunction with a Template, the Profile determines a complete report configuration that can be analyzed. A Profile can have several templates, just as a template can be applied to many Profiles. A web site can have one or many Profiles and templates.

Protocol
An established method of exchanging data over the Internet.

Psychographics
Used to build customer segments based on attitudes, values, beliefs, and opinions as opposed to the factual characteristics of demographics. Political views, learning patterns, or music tastes would qualify for psychographic segmentation. Marketing research usually combines demographic and psychographic information to build a more comprehensive understanding of customers.
Because the Internet is still a relatively new and evolving medium, one which the mass market is still getting used to and whose usage patterns are determined both by levels of Web experience and type of person, psychographics are of great interest for the Web. The ability of an online broker to convert browsers to online traders, for example, will depend to a large degree on the type of person using the site: are they confident people who like to give things a go or are they risk-averse followers of the masses? Psychographic segments built on attitudinal and behavioral characteristics will often be good indicators of how customers will use and react to a web site.
Purchase
A completed transaction involving an exchange of money for a product, service, privilege, or other item.

Purchase Conversion Funnel
A specific kind of scenario analysis consisting of steps leading to online purchases. The steps of the scenario are designated by the WebTrends administrator.

Query Parameter
An individual piece of a query string consisting of a parameter name and a value for the parameter.
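As an illustration of the Parameter and Query Parameter entries above, the following Python sketch uses the standard `urllib.parse` module to split a URL's query string into name=value pairs. The helper name `query_parameters` is chosen for this example; the URL is the one used in the Parameter entry.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url):
    """Return the query parameters of a URL as a list of (name, value) pairs."""
    return parse_qsl(urlsplit(url).query)


pairs = query_parameters("/products/furniture.asp?cart_id=445&product=couch")
print(pairs)  # prints: [('cart_id', '445'), ('product', 'couch')]
```

A list of pairs, rather than a dictionary, preserves the order of the parameters and any repeated names.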
Query String
The part of a URL that contains information about the content of a dynamically generated page. Web servers use this information to retrieve the specified content from a database and combine it with a template to display a page. A Query String can also contain information that is not directly used to construct a page, but which is intended for use in reporting or other functions. WebTrends SmartSource SDC tagging is often used to insert valuable reporting information into the query string. In many dynamic URLs, the Query String is the part of the URL that follows a question mark.

Recency
The number of days since a visitor's most recent visit since tracking with persistent cookies and Visitor History began. Zero recency refers to a visit in the preceding 24 hours. Average Recency is the average of the recency of all visitors during the reporting period. If visitors did not visit the site during the report time period, their Recency is not included.

Redirect Page
A web page that is coded to take the visitor's browser to another page automatically and usually immediately. Many redirects are instantaneous and the visitor does not see the redirect page. Some have time delays and allow the visitor to see the redirect page for a certain number of seconds. Redirects are used to help track clicks that go off site, or to an executable, downloadable, or other file that cannot normally be logged.

Referrer
A web domain, site, or page that contains a link to one of your site pages that was used by a visitor to get to your site.

Referring Domain
A web domain that contains a link to one of your site pages, used by a visitor to get to your site. For example, yahoo.com.

Referring URL
The URL of a specific page on a site that contains a link to one of your site pages that was used by a visitor to get to your site.

Registration Conversion Funnel
A specific kind of scenario analysis consisting of steps leading to online registration.
The word funnel refers to the typical attrition of visitors from one step
to the next. The steps of the scenario are designated by the WebTrends administrator.

Repeat Buyers
Visitors who bought something during the reporting period and are known to have bought something previously as well. Use persistent cookies to track Repeat Buyers. If buyers have cookie parameters for purchases from your site dating from before their purchases during the reporting period, they are repeat buyers. Visitors whose browsers do not accept cookies appear as unknown in reports that display first-time vs. repeat buyers.

Returning Visitors
Visitors who have been to your site before. Returning visitors are identifiable only on sites that give out persistent cookies. WebTrends identifies visitors as repeat visitors if they have a cookie from your site dating from before their first visit during the reporting period. Visitors whose browsers do not accept cookies appear as unknown in reports that display new and repeat visitors.

Report
A term loosely applied to graphs and a table associated with an individual analysis, or the collection of all such reports resulting from the analysis of a given profile and template.
Report Period, Reporting Period
The dates covered by the data displayed in a report. WebTrends users may select a report period of any day, week, month, quarter, or year, or a custom date range, and can switch between date ranges as desired.

Report Templates
A set of report characteristics consisting of content, the content's order of appearance, graphic type specification, style, format, language, and other settings which determine the form and content of a finished report. A given profile can have many templates assigned to it, and the report user can view different templates depending on permissions in place. Likewise, a given template can be assigned to many different profiles.

Request
A signal from a browser to a server that asks the server to send a specific file to the browser. The request, plus some details about the server's response to the request, is recorded as a line in a log file. Although GET in a log file is usually
thought of as a request, both POST and GET methods are requests.

Resolve
With respect to IP addresses, indicates success in identifying and displaying a text domain name for a numeric IP address.
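To show what a request recorded as a log line can look like, the following Python sketch parses one line in the widely used Apache-style combined layout (host, identity, user, time, request, status, size, referrer, user agent). The sample line is invented for illustration, and the pattern is a minimal sketch that assumes well-formed input; real log files vary by server and configuration.

```python
import re

# Assumption: Apache-style combined log layout. Field names are chosen for this example.
COMBINED_LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_request(line):
    """Return the fields of one log line as a dict, or None if the line does not match."""
    match = COMBINED_LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log line for a GET request.
sample_line = (
    '212.6.125.76 - - [10/Oct/2004:13:55:36 -0700] '
    '"GET /products/furniture.asp?product=couch HTTP/1.1" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
fields = parse_request(sample_line)
```

In the parsed result, `fields["method"]` holds GET or POST, `fields["status"]` holds the return code, and the last two fields are the Referrer and User Agent that distinguish the combined format from the basic one.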
Retention
How well a site draws visitors back for more visits.
Alternatively, a measure of the effectiveness of a source of visitors (a campaign, a search engine, individual keywords on a search engine, an affiliate site, etc.) measured in terms of Recency and Frequency of visitors who were originally introduced to the site by that source.
Return Code
A code in the status field of a log file that identifies the success, failure, and other characteristics of a transfer of data from a server to a browser. Also called Status Code. See the Status Code entry for a full list of all error codes.

Returning Visitors
Visitors who have been to your site before.
Returning visitors are identifiable only on sites that give out persistent cookies. WebTrends identifies visitors as returning visitors if they have a cookie from your site dating from before their first visit during the reporting period. Visitors whose browsers do not accept cookies appear as unknown in reports that display new and returning visitors.
Reverse Path A path that ends at a designated page, called the destination page in WebTrends reports. "Reverse" indicates backing up from a certain page to examine how visitors arrived there.
RFM A group of measures, made up of Recency, Frequency, and Monetary Value, that are useful for segmenting customers for marketing purposes. RFM analysis is a marketing technique used to determine quantitatively which customers are the best ones by examining how recently a customer has purchased (recency), how often the customer purchases (frequency), and how much the customer spends (monetary value). RFM requires the use of persistent cookies and Visitor History. If visitors did not visit the site during the report time period, their RFM is not included.
Scenario A series of two or more pages on a web site that can be treated as a kind of process or logical sequence, such as the process of making a purchase (the checkout process), the process of signing up for a newsletter (the signup or registration process), the process of using a gift finder, and so on. While a scenario by definition has a series of ordered steps, it is possible for visitors to start processes mid-scenario, for example when a campaign directs visitors to step 2 of the scenario. New scenario visualization capabilities show visitor progress through scenarios, as well as the origin of visits entering scenarios midway and where visitors went after leaving the scenario. Scenarios are defined by the WebTrends administrator.
Scenario Analysis A report showing the amount of activity at each step of a defined scenario, plus conversion rates for each transition from step to step as well as for the whole process. Examples of scenarios are check-out, registration, or application sequences. New scenario visualization capabilities show visitor progress through scenarios, as well as the origin of visits entering scenarios midway and where visitors went after leaving the scenario.
Scenario Conversion Rate The percentage of started scenarios that were completed.
Script A simple programming language used to execute tasks. Scripts are often used for pages on the Internet to serve dynamic content and to tailor pages for individual visitors.
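The step-to-step and whole-process conversion rates described under Scenario Analysis can be illustrated with a short Python sketch. It assumes a list of per-step visit counts is already available and ignores visits that enter mid-scenario; the function name and the sample step names are hypothetical.

```python
def scenario_conversion(step_visits):
    """Compute conversion rates for an ordered scenario.

    step_visits is an ordered list of (step_name, visits) pairs.
    Returns (per-transition rates, overall rate), with rates as fractions.
    """
    transitions = []
    for (prev_name, prev_count), (name, count) in zip(step_visits, step_visits[1:]):
        # Share of visits at the previous step that reached this step.
        rate = count / prev_count if prev_count else 0.0
        transitions.append((prev_name, name, rate))
    first, last = step_visits[0][1], step_visits[-1][1]
    # Scenario conversion rate: completed scenarios relative to started ones.
    overall = last / first if first else 0.0
    return transitions, overall


checkout = [("cart", 200), ("shipping", 100), ("confirm", 50)]
transitions, overall = scenario_conversion(checkout)
```

With these sample counts, each transition converts half of the visits and the scenario as a whole converts one quarter of them.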
Search Engine Keywords A single word within a search phrase, or a search word used by itself. In the phrase "cordless phone" the individual keywords are "cordless" and "phone". Also called search keyword.
Search Engine Phrase All the words used in a search. In the search "cordless phone" the phrase is "cordless phone", and in the search "phone" the phrase is "phone". Also called search phrase.
Search Engine A web site that enables users to search for web pages throughout the Internet by entering keywords.
Search Engine Marketing The art and science of increasing a web site's visibility and traffic by being listed favorably on search engines for a defined set of keywords and phrases through paid and optimization tactics.
Search Engine Optimization The art and science of optimizing your web site to improve the natural listing or ranking your site receives from search engines for certain keywords and phrases. Often referred to as SEO.
Server A computer that stores a web site and interacts with browsers to send (serve) web pages and other files associated with the web site.
Server Errors Errors that occur at the web server and receive an error code in the 500 range. Some of the most commonly experienced server errors are:
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Time-out
505 HTTP Version Not Supported
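As a rough illustration of the 500 range described above, the following Python sketch tallies server errors from a sequence of status codes. It assumes the codes have already been extracted from the status field of a log file; the function name is hypothetical.

```python
from collections import Counter


def server_error_counts(status_codes):
    """Count how often each 500-range status code appears."""
    return Counter(code for code in status_codes if 500 <= code <= 599)


counts = server_error_counts([200, 500, 503, 404, 500])
```

Here `counts` records two occurrences of 500 and one of 503; the 200 and 404 responses are ignored because they fall outside the 500 range.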
Session, Sessionize, Sessionization The process of dividing and ordering a list of page views and events in a site's log into visits or sessions, where each visit includes the sequence of pages viewed by a visitor during a specified time period.
Shopping Cart A part of a shopping web site where visitors can park items they have selected, presumably for eventual purchase.
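Sessionization as defined above can be sketched as follows. This is a minimal Python illustration, not the WebTrends algorithm: it assumes each log record has already been reduced to a (visitor ID, timestamp, page) tuple, and it uses a 30-minute idle limit as an assumed default.

```python
from datetime import datetime, timedelta


def sessionize(hits, idle_limit=timedelta(minutes=30)):
    """Group (visitor_id, timestamp, page) tuples into visits.

    A new visit starts when a visitor has been idle for idle_limit or longer.
    Returns a list of dicts with 'visitor', 'pages', and 'last_seen' keys.
    """
    visits = []
    open_visit = {}  # visitor_id -> the visit currently being built
    for visitor_id, ts, page in sorted(hits, key=lambda hit: hit[1]):
        visit = open_visit.get(visitor_id)
        if visit is None or ts - visit["last_seen"] >= idle_limit:
            visit = {"visitor": visitor_id, "pages": [], "last_seen": ts}
            visits.append(visit)
            open_visit[visitor_id] = visit
        visit["pages"].append(page)
        visit["last_seen"] = ts
    return visits


hits = [
    ("a", datetime(2004, 3, 1, 10, 0), "/home"),
    ("b", datetime(2004, 3, 1, 10, 5), "/home"),
    ("a", datetime(2004, 3, 1, 10, 10), "/products"),
    ("a", datetime(2004, 3, 1, 11, 0), "/cart"),
]
visits = sessionize(hits)
```

In this sample, visitor "a" produces two visits because 50 minutes pass between the second and third page views, and visitor "b" produces one.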
Single Access Page In WebTrends 6.x and before, a visit that consists of only one page view. In WebTrends 7.x and after, these are called Single-page Visits.
Single-page Visit A visit that consists of only one page view. In Single-page Visits, the page viewed is counted in at least three WebTrends reports: Single-page Visits, Entry Pages, and Exit Pages.
SmartSource A trademarked technology from WebTrends. SmartSource Data Management offers an alternative to traditional web server log file analysis, collecting information directly from the visitor's browser (the client) rather than from server log files, which improves data accuracy. Special script in a page's source code is used to transmit page-level data, not hit-level data, to a data collection server, dramatically reducing data volume and decreasing processing time. Advantages of using SmartSource include capturing page views resulting from back button use, capturing views of cached pages, and the opportunity to collect extra, customized data not included in normal web server log files.
SmartSource Data Collector (SDC) A specialized web server application, proprietary to WebTrends, that acts as the recipient and organizer of data transmitted from web pages by WebTrends SmartSource Tags. The SmartSource Data Collector also validates and generates cookies and delivers a .gif file as part of the data collection process.
SmartSource Parameter WebTrends SmartSource SDC tagging is often used to insert valuable reporting information into the query string of URLs. This is done through SmartSource Parameters, which consist of name=value pairs.
SmartSource Tags A WebTrends script (JavaScript or VBScript) that can be added to the code of a web page to capture information about a visit to that web page (for example, IP of visitor, time of day, name of page, parameters, etc.) and send it to a data collection server such as WebTrends SmartSource Data Collector. The code is executed when the page is loaded into a browser.
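The name=value pairs mentioned under SmartSource Parameter can be read back out of a URL query string with the Python standard library. The sketch below is illustrative only; the parameter names `campaign` and `group` and the example.com URL are hypothetical and are not actual WebTrends parameter names.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url):
    """Return the name=value pairs in a URL's query string as a dict."""
    return dict(parse_qsl(urlsplit(url).query))


params = query_parameters(
    "http://www.example.com/view.asp?campaign=spring&group=phones"
)
```

Here `params` maps "campaign" to "spring" and "group" to "phones". If a name appears more than once in the query string, this simple version keeps only the last value.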
Spider An automated program that crawls widely through the Internet and collects and indexes information, usually on behalf of a search engine or a monitoring company. A spider can often be identified through the User Agent field of a log file, or through its IP address.
Status Code A code in the status field of a log file that identifies the success, failure, and other characteristics of a transfer of data from a server to a browser. Also called Return Code.
100 = Success: Continue
101 = Success: Switching Protocols
200 = Success: OK
201 = Success: Created
202 = Success: Accepted
203 = Success: Non-Authoritative Information
204 = Success: No Content
205 = Success: Reset Content
206 = Success: Partial Content
300 = Success: Multiple Choices
301 = Success: Moved Permanently
302 = Success: Found
303 = Success: See Other
304 = Success: Not Modified
305 = Success: Use Proxy
307 = Success: Temporary Redirect
400 = Failed: Bad Request
401 = Failed: Unauthorized
402 = Failed: Payment Required
403 = Failed: Forbidden
404 = Failed: Not Found
405 = Failed: Method Not Allowed
406 = Failed: Not Acceptable
407 = Failed: Proxy Authentication Required
408 = Failed: Request Time-out
409 = Failed: Conflict
410 = Failed: Gone
411 = Failed: Length Required
412 = Failed: Precondition Failed
413 = Failed: Request Entity Too Large
414 = Failed: Request-URI Too Large
415 = Failed: Unsupported Media Type
416 = Failed: Requested range not satisfiable
417 = Failed: Expectation Failed
500 = Failed: Internal Server Error
501 = Failed: Not Implemented
502 = Failed: Bad Gateway
503 = Failed: Service Unavailable
504 = Failed: Gateway Time-out
505 = Failed: HTTP Version Not Supported
Stem The part of a dynamic URL that is the template. It is usually the part of the URL before the question mark that separates the template from the parameters. Same as URL Stem Field.
Step In path analysis, each page view in the path is a step. In Scenario Analysis, each page in the Scenario is a step.
Subtotal In WebTrends report tables, this usually refers to the total for just the line items appearing in the part of the table on one report page, i.e., the part that can be seen by scrolling but not by clicking on a forward or back button. If a table spans several pages, each page's portion of the table will have its own subtotal. Statistics for parts of the table not shown on the current page will appear as Other.
Suffix (Domain Name) The three-letter suffix of a domain name can be used to identify the type of organization to which the web site belongs. For example, the suffix .edu implies that the organization associated with the site is an educational organization.
Table In WebTrends, a matrix or tabular array of results. Each report usually contains one or more graphs and a table. A table may be broken up to span several pages, or it may fit on one page.
Tag A script (JavaScript or VBScript) that can be added to the code of a web page to capture information about a visit to that web page (for example, IP of visitor, time of day, name of page, parameters, etc.) and send it to a data collection server such as WebTrends SmartSource Data Collector. The WebTrends proprietary tag is called the SmartSource Tag.
Target Page When a redirect page is used, the target page is the page to which the visitor's browser is sent. The term can also refer to the web page that is the destination of a hyperlink.
Template A collection of WebTrends settings that has a unique name and defines the content and appearance (language, style) of reports to which it is applied. When specified in conjunction with a profile, it determines a complete report configuration that can then be analyzed. In many cases, a given template can be applied to any profile, and a given profile can have many templates. A template allows you to automate and easily customize the content on the WebTrends Desktop for a specific business function or user. Templates give administrators and users the ability to customize their views, as well as assign dashboards, reports, and language preferences to a given template.
Time to Serve The time it takes to serve up a web page to a visitor, measured in milliseconds.
Top The pages from which most users enter the site or leave the site. Can be distorted by non-human traffic (for example, spiders and robots). Useful for seeing whether many people are following a particular link out of the site or whether visitors appear to have bookmarked a page other than the homepage.
Top-Level Domain The suffix of a domain name. A top-level domain can be based on the type of organization (.com, .edu, .gov, .name, etc.) or it can be a country code (.uk, .de, .jp, .us, etc.). The top-level domain can be used to identify the type of web site.
Traffic In general terms, the number of visits, visitors, or activity on a web site.
Translation Files Comma-separated value files (.csv) used to convert analysis information into more helpful report data. Their uses include creating more readable reports and providing drilldown analysis for campaigns and products. They can translate a captured value into another single value or, when using drilldown capabilities, into multiple values that all pertain to the original value.
Unique Visitors Number of unique individuals who visited your site during the report period, as identified by a persistent cookie. If someone visits more than once during the report period, they are counted only as one unique visitor. Unique visitors may not perfectly match the number of unique individuals visiting the site, because someone may visit a site from more than one computer and have a different cookie at each computer, or several people may share the same computer to access the same web site.
Unknown Unknown is a possible line item in several WebTrends reports. In geography-related and organization-related reports, unknown origin means WebTrends was unsuccessful in looking up an IP address or domain name. In first-time versus repeat visitor and buyer reports, it refers to visitors whose browsers did not accept cookies. If all visitors appear as unknown in repeat visitor reports, the site does not issue persistent cookies.
URL Uniform Resource Locator. It is a means of identifying an exact location on the Internet. For example, http://www.webtrends.com/html/info/default.htm is the URL that defines the location of the page default.htm in the /html/info/ directory on the NetIQ Corporation web site. As the previous example shows, a URL consists of four parts: Protocol Type (HTTP), Machine Name (webtrends.com), Directory Path (/html/info/), and File Name (default.htm).
URL Query String The portion of the URL that contains query parameters.
URL Stem Field The part of a dynamic URL that is the template. It is usually the part of the URL before the question mark that separates the template from the parameters. Same as Stem.
User Agent Portion of a log file that identifies the browser and platform used by a visitor. Also identified through Tags.
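The URL, URL Stem Field, and URL Query String entries can be illustrated by splitting a URL with the Python standard library. This sketch assumes that the stem is the path portion of the URL (everything between the machine name and the question mark); the function name, the dictionary keys, and the second example URL are hypothetical. Note that the standard library reports the full host name, www.webtrends.com, as the machine name.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the parts named in the glossary entries above."""
    parts = urlsplit(url)
    # Everything up to the last slash is the directory; the rest is the file.
    directory, _, file_name = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "machine_name": parts.netloc,
        "directory_path": directory + "/",
        "file_name": file_name,
        "stem": parts.path,           # assumed: the path before the question mark
        "query_string": parts.query,  # the parameters after the question mark
    }


static = url_parts("http://www.webtrends.com/html/info/default.htm")
dynamic = url_parts("http://www.example.com/shop/item.asp?sku=1234&color=red")
```

For the dynamic example, the stem is /shop/item.asp and the query string is sku=1234&color=red.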
VBScript Tag A script (VBScript or sometimes JavaScript) that can be added to the code of a web page to capture information about a visit to that web page (such as IP of visitor, time of day, parameters) and send it to a data collection server such as WebTrends SmartSource Data Collector.
Visit All the activity of one visitor's browser on a web site within certain time constraints. A visit is a series of page views, beginning when a visitor's browser requests the first page from the server, and ending when the visitor leaves the site or remains idle beyond the idle-time limit.
Visitor A person at a computer using a browser to visit a web site. A visitor may make more than one visit during a given time period. Note the combination of person, computer, and browser. Since a person may use different computers, or even different browsers on the same computer, the same person can appear as more than one visitor, because the chief means of distinguishing a visitor is a persistent cookie or, less desirably, the combination of IP address and platform/browser details.
Visitor History Visitor History is a feature in WebTrends that, when activated, records specific information about the history of your visitors, including how often they have visited your site (frequency), how recently they've visited (recency), the number of days between their visits (latency), the value of all their purchases (lifetime value), the campaign that generated their first visit to your site, the search engine phrase used most recently to visit your site, and more. Many reports depend on Visitor History being activated, such as any of the "Buyers by" reports. The Visitor History table captures four categories of information, each of which offers a variety of measurements and possible report combinations that allow visitor segmentation: Visit Attributes, Campaign Attributes, Purchase History, and Visitor Firsts. Also, Purchase History can measure any form of conversion the WebTrends administrator defines, not just sales. Persistent cookies are used to recognize unique visitors and to record Visitor History events, which are associated only with this unique ID, not with specific, known individuals. With all Visitor History measures and reports, a visitor must have visited the site during the report time period in order for their Visitor History data (data which may be outside the report time period) to be included in the report.
Visitor Session The full period of time a visitor spends at a particular site. As soon as there are 30 minutes (definable within WebTrends) of inactivity, the session is closed.
WAP Wireless Application Protocol.
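The frequency, recency, latency, and lifetime value measures named in the Visitor History entry above can be approximated for a single visitor as in the Python sketch below. It is not how WebTrends stores or computes Visitor History: the function name is hypothetical, latency is assumed to be the average number of days between consecutive visits, and at least one visit date is required.

```python
from datetime import date


def visitor_history(visit_dates, purchases, as_of):
    """Summarize one visitor's history as of a given date."""
    dates = sorted(visit_dates)
    # Days between each pair of consecutive visits.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "frequency": len(dates),                    # how often they visited
        "recency_days": (as_of - dates[-1]).days,   # days since the last visit
        "latency_days": sum(gaps) / len(gaps) if gaps else None,
        "lifetime_value": sum(purchases),           # value of all purchases
    }


summary = visitor_history(
    [date(2004, 1, 1), date(2004, 1, 11), date(2004, 1, 31)],
    [20.0, 30.0],
    as_of=date(2004, 2, 10),
)
```

For this sample visitor, the gaps between visits are 10 and 20 days, so the assumed average latency is 15 days.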
WAP Browser A program used on a WAP device to display site content, similar to Netscape or Internet Explorer on PCs.
WAP Carrier A server that acts as an intermediary and relays requests from visitors with WAP devices to your site.
WAP Device A wireless device using Wireless Application Protocol (WAP), such as a cellular telephone or radio transceiver, that can be used to access the Internet. WebTrends software reports include WAP devices only if the web data activity file shows the device used a WAP browser.
WebTrends Data Warehouse The WebTrends Data Warehouse (formerly called the Webhouse Builder) transforms raw web data activity files into a normalized format that can later be used by web traffic analysis profiles for analysis and reporting. Without the WebTrends Data Warehouse, large log files must typically be stored on a separate machine accessed through a mapped drive, which makes the speed of the analysis dependent on the speed of the network connection. Additionally, raw web data activity files are just that: unprocessed and in their original state. Web data activity files that have been imported and stored using the WebTrends Data Warehouse have already been parsed, normalized, processed, and possibly even filtered, making reporting time for large logs significantly shorter.
Well-known Parameter Specially named URL parameters that work specifically with the WebTrends Auto-configuration feature. These parameters are created and transmitted by SmartSource Tags or using WebTrends Script, and are recognized by WebTrends to allow automatic generation of reports based on those parameters, without the need for configuration on the part of the WebTrends administrator. For example, parameters can be used to assign a page to certain Content Groups or Scenarios, or to insert data into Visitor History Tables as first campaign or other attributes.
WTLS Acronym for Wireless Transport Layer Security protocol, which is the security layer endorsed by the WAP Forum (www.wapforum.org). Its primary goal is to provide privacy, data integrity, and authentication for WAP applications.
Zero-page Visit A visit that included no page views. This is possible if a visit consisted of at least one request for a non-page file (such as a graphic), but no page files (such as .htm, .asp, .jsp, or .cfm).
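The distinction between zero-page, single-page, and multi-page visits can be sketched by counting page files among the files requested in one visit. The Python example below is an illustration only: the function name is hypothetical, and the list of page extensions is an assumption that extends the examples above with .html.

```python
# Assumed set of extensions that count as page files.
PAGE_EXTENSIONS = (".htm", ".html", ".asp", ".jsp", ".cfm")


def classify_visit(requested_files):
    """Classify a visit by the number of page files it requested."""
    page_views = sum(
        1 for name in requested_files if name.lower().endswith(PAGE_EXTENSIONS)
    )
    if page_views == 0:
        return "zero-page visit"
    if page_views == 1:
        return "single-page visit"
    return "multi-page visit"
```

For example, a visit that requested only /images/logo.gif is a zero-page visit, while one that requested /index.htm and /images/logo.gif is a single-page visit.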