
Working with Large Datasets in VantagePoint

Overview:

Dave Schoeneck Senior Research Analyst, Search Technology, Inc.

When you work with datasets of more than ~20,000 records in VantagePoint (VP), you may see an error message saying that you are running out of RAM. The following guidelines may help you free up system memory and continue working. Which guidelines to apply depends on your analytical needs and where you are in your workflow.

Strategies discussed in this document include:

- Use a 64-bit operating system and install the maximum amount of RAM supported by your computer.
- Close other programs that are not essential to your analysis.
- Import a small number of fields at first; use Import More Fields to add other fields later.
- Use VantagePoint's Memory Manager tool.

Use a 64-bit OS and Install the Maximum RAM

VantagePoint is a 32-bit application and is subject to the per-process memory limits of the operating system. These limits apply regardless of how much physical memory the computer has installed. If you are using VantagePoint on a 32-bit version of Windows, the maximum amount of memory that VP can use is 2 GB. On a 64-bit Windows system, VantagePoint can use up to 3 GB.

Close Non-essential Programs and *.vpt Files

If you have other applications running that are not essential to your workflow, close them to make more system memory available to VP. If you have more than one VantagePoint data file (*.vpt) open, close all of them except the one in which you are currently working.

Maintain a Dataset with as Few Fields as Possible

When you maintain a dataset with only the essential fields, you also keep the size (in MB) of the *.vpt file on disk as small as possible. This is especially important when you import raw data files: it is advisable to import only the Title field at first, so that you do not run out of memory before you save the *.vpt file to disk. Once you have saved your dataset as a *.vpt file, exit and restart VP (to free up as much memory as possible) and open your saved dataset.

You can use Import More Fields (from VP's Fields menu) to add other fields you need after your data is imported and saved to a *.vpt file. Use discretion when choosing which fields to add. Whenever possible, avoid importing fields with long text (e.g. Patent Claims, Abstracts, etc.). Fields with a very large number of items will also consume a lot of system resources. Examples of such fields include:

- Fields with NLP Words or Phrases
- Cited References fields and fields derived from Cited References (e.g. Cited Authors or Cited Journals)
- Authors, Inventors, Full Organization Names, or fields with uncontrolled vocabulary terms (see note below)

Note: Delete existing large fields that are not in use, but only if they can be readily imported again using Import More Fields. Take care not to delete Cleaned fields or fields that have Groups you want to keep. Cleaned fields cannot be readily re-imported with Import More Fields [1] (but the originating field on which the cleaning was done can usually be safely deleted).

Fields that include a lot of items also tend to have long tails in their record frequency distributions. That is, the vast majority of terms occur in only one or two records. When this is the case, consider creating a group of all terms that occur in at least N records. You can then use Create Field using Group Items to make a new field with far fewer items, and delete the originating, much larger field.

Use VantagePoint's Memory Manager Tool

VantagePoint includes a Memory Manager tool, which you can use to unload fields from memory when they are not in use. You can use the Minimize Memory Use button after deleting fields, lists, matrices, or other types of sheets to make more system memory available. This feature is accessible only through a user-defined keyboard shortcut; instructions for setting up a hotkey follow.

How To: Set up a Hotkey to Open VantagePoint's Memory Manager Tool

VantagePoint's Memory Manager tool can be accessed only through a keyboard shortcut, which you will need to set up. The following illustrations walk through the steps to set a hotkey. Launch VantagePoint and select: Tools > Edit Keyboard Shortcuts
Figure 1

[1] If you saved your List Cleanup work as a thesaurus, you can re-import the original field and run your saved thesaurus on it.

Follow the steps in Figure 2 to configure the hotkey:
Figure 2

Press the hotkey combination to open the Memory Manager window. Fields can be unloaded from memory one by one, or you can use the Minimize Memory Use button to unload all fields that are not loaded in detail windows or used by the currently viewed list, matrix, or other sheet.
Figure 3
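
The note on long-tailed fields earlier in this document suggests grouping the terms that occur in at least N records. The Python sketch below is not VantagePoint functionality. It only illustrates, on made-up author data and with hypothetical function names, how a record-frequency threshold can shrink a field with many rarely occurring items.

```python
from collections import Counter


def record_frequencies(records):
    """Count, for each term, the number of records in which it appears."""
    counts = Counter()
    for terms in records:
        # set() ensures a term is counted once per record, even if repeated.
        counts.update(set(terms))
    return counts


def terms_in_at_least(counts, n):
    """Return the terms that occur in at least n records."""
    return {term for term, freq in counts.items() if freq >= n}


# Made-up example data: each record lists the authors that appear on it.
records = [
    ["Smith, J", "Lee, K"],
    ["Smith, J", "Garcia, M"],
    ["Smith, J", "Lee, K", "Chen, W"],
    ["Patel, R"],
    ["Lee, K", "Novak, P"],
]

counts = record_frequencies(records)
kept = terms_in_at_least(counts, 2)  # N = 2 is an arbitrary threshold
print(f"{len(counts)} distinct terms, {len(kept)} occur in at least 2 records")
print(sorted(kept))
```

In this toy data, only two of the six authors meet the threshold. In VantagePoint itself, the equivalent step is to create a group of the qualifying terms and then use Create Field using Group Items, as described in the note.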
