

NeuroScript MovAlyzeR Help

Word Extraction

Processing

Word Extraction Settings

Word extraction is an option to extract individual words when a trial consists of multiple words. Each word is stored in one sub-condition and processed. To set up word-extraction conditions for an experiment, refer to the section on word extraction.

A sentence is tested for a new word every time there is a pen-up. At this point, the following word extraction parameters are evaluated to decide whether a new word has started and whether the sampling points collected since the previous pen-up (or the start of recording) can be considered a new word.

 

Minimum width (cm): If the width/space of a word is more than this value, it is treated as a new word.

~ If you increase this value to 0.5 cm (for example), and a word 0.6 cm wide is followed by a word 0.4 cm wide, the second word is considered part of the first word.

Min horizontal leftward distance in cm between words (cm): If the distance corresponding to a pen-up duration within a line is more than this value, it is treated as a new word.

~ If you increase this value to 1 cm (for example), and word 1 is followed by word 2 at a distance of 0.5 cm, the second word is considered part of the first word and a new word is not recognized. For a handwriting trial consisting of a sentence, it is recommended to set this to a higher value and to ask the writer to leave some space between words.

Min horizontal leftward penlift in cm between words (cm): In a handwriting trial with words, there is a penlift of a few seconds when the writer moves from one word to the next. While recording, the program keeps track of the distance the pen is moved, and the time it is moved (pen-up time), while it is lifted just above the tablet. If the distance the pen is moved is greater than the value specified, a new word is detected.

Minimum downward distance between word beginnings (cm): If a sentence/line in a trial extends beyond a single line, then this value defines the distance in the downward direction after which the recording is treated as a new line.

Min duration of penlift in seconds between words (s) and Max duration of penlift in seconds within a word (s): If the pen-up time is less than the maximum duration within a word, no new word is detected. If the pen-up time is greater than the minimum duration between words, a new word is detected.
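The parameters above can be pictured as a small settings record plus a couple of threshold checks. The following Python sketch is illustrative only: the class, the field names, and the numeric values are assumptions made for this example (they are not MovAlyzeR identifiers or defaults), and treating pen-up times between the two duration limits as "undecided" is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class WordExtractionSettings:
    """Hypothetical container; names and values are placeholders, not program defaults."""
    min_word_width_cm: float = 0.5
    min_leftward_distance_cm: float = 1.0
    min_leftward_penlift_cm: float = 1.0
    min_downward_distance_cm: float = 1.0
    min_penlift_between_words_s: float = 0.5
    max_penlift_within_word_s: float = 0.1


def gap_starts_new_word(gap_cm: float, s: WordExtractionSettings) -> bool:
    # A horizontal gap counts as a word boundary only if it exceeds the minimum distance.
    return gap_cm > s.min_leftward_distance_cm


def penlift_verdict(penup_s: float, s: WordExtractionSettings) -> str:
    # Short penlifts stay inside a word; long penlifts separate words.
    if penup_s < s.max_penlift_within_word_s:
        return "same word"
    if penup_s > s.min_penlift_between_words_s:
        return "new word"
    # Assumption: anything in between is left to the distance-based parameters.
    return "undecided"


settings = WordExtractionSettings()
print(gap_starts_new_word(0.5, settings))  # 0.5 cm gap with a 1 cm minimum: False
print(penlift_verdict(0.8, settings))      # longer than the 0.5 s minimum: new word
```

With the minimum distance set to 1 cm, a 0.5 cm gap does not start a new word, which mirrors the 1 cm example given for that parameter.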

 

The Word Extraction Algorithm

The following sequence of tests is employed to determine whether a new word has been found. The relevant parameters, which appear on the right-hand side of each comparison, can be set in Experiment Properties > Advanced > Word Extraction.

Test 1: Horizontal size of penups between words > Min. leftward penlift movement between words

Test 2: Horizontal distance of pendown between words > Min. leftward distance between words

Not a new word if Width of word < Min. word width

Test 3: Large downward vertical movement to new line > Min. downward distance between word beginnings

Test 4: Large backward horizontal movement to new line > Min. word width

Test 5: Large penup time within a word > Min. duration of penlifts between words

Test 6: Initial penup segment is a new word: If there is an initial PenUp part in the data, that is a new word.

Not a new word if Penup time between words < Max. duration of penlifts within a word
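The sequence of tests above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program's actual implementation: the class and field names are hypothetical, the numeric settings are placeholders, and the scope of the two "Not a new word" conditions is assumed (the word-width condition is applied to Tests 1 and 2, and the within-word penlift condition to Tests 1 through 5).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    # Placeholder values chosen for illustration; not the program defaults.
    min_leftward_penlift_cm: float = 0.3
    min_leftward_distance_cm: float = 0.3
    min_word_width_cm: float = 0.5
    min_downward_distance_cm: float = 1.0
    min_penlift_between_words_s: float = 0.5
    max_penlift_within_word_s: float = 0.05


@dataclass
class PenUpEvent:
    # Hypothetical measurements taken at one pen-up.
    penup_dx_cm: float      # horizontal size of the pen-up movement
    pendown_gap_cm: float   # horizontal distance between pen-down strokes
    word_width_cm: float    # width of the candidate word
    downward_dy_cm: float   # downward movement toward a new line
    backward_dx_cm: float   # backward horizontal movement toward a new line
    penup_time_s: float     # duration of the penlift
    is_initial_penup: bool = False


def first_passing_test(ev: PenUpEvent, s: Settings) -> Optional[str]:
    """Return the name of the first test that signals a new word, or None."""
    wide_enough = ev.word_width_cm >= s.min_word_width_cm          # assumed scope: Tests 1-2
    long_enough = ev.penup_time_s >= s.max_penlift_within_word_s   # assumed scope: Tests 1-5
    tests = [
        ("Test 1", ev.penup_dx_cm > s.min_leftward_penlift_cm and wide_enough),
        ("Test 2", ev.pendown_gap_cm > s.min_leftward_distance_cm and wide_enough),
        ("Test 3", ev.downward_dy_cm > s.min_downward_distance_cm),
        ("Test 4", ev.backward_dx_cm > s.min_word_width_cm),
        ("Test 5", ev.penup_time_s > s.min_penlift_between_words_s),
    ]
    for name, passed in tests:
        if passed and long_enough:
            return name
    if ev.is_initial_penup:
        return "Test 6"
    return None


event = PenUpEvent(penup_dx_cm=0.1, pendown_gap_cm=0.4265, word_width_cm=1.0,
                   downward_dy_cm=0.0, backward_dx_cm=0.0, penup_time_s=0.2)
print(first_passing_test(event, Settings()))  # Test 2
```

Returning the name of the test that fired mirrors the Word Extraction Data view, which reports which test and parameter produced each word.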

 

How to set suitable Word Extraction parameters

The accuracy of the word extraction algorithm depends entirely on the values of the parameters set in Experiment Properties > Advanced > Word Extraction.

To arrive at suitable values for these parameters, the following procedure is recommended:

1) Go to Experiment Properties > Advanced > Word Extraction and click on Reset to Defaults.

2) Reprocess the trial from which you want to extract the words. HWR files containing the individual extracted words will populate the tree.

3) Right-click on the base trial from which the words were extracted > View Numerical Data > View Word Extraction Data. The Word Extraction Data shows which test and parameter the algorithm used to extract each word.

4) Right-click on the first extracted file > Chart Data > Chart Raw Data.

5) Visually inspect each extracted word for accuracy. Click the -> button to traverse through the trials.

6) If you find a word that is incorrectly extracted (the word is split between two trials), then look at the Word Extraction Data file and determine the test and parameter that were used to extract that particular word.

Note: If words are grouped together when they should have been split into individual words, then reduce all the Word Extraction parameters to half of their default values and continue from there.

For example, in the Word Extraction Data file:

Word 54: Samples(42821-42962) | Horizontal distance of pendown between words: PendownLeftx(11.295 cm) - PreviousPendownRightx(10.8685 cm) = 0.4265 cm >= MinLeftPendown(0.3 cm)

7) Go to Experiment Properties > Advanced > Word Extraction and change the Min. leftward distance between words from 0.3 cm to 0.5 cm, so that the 0.4265 cm gap in the example no longer exceeds the threshold.

8) Reprocess the base trial and repeat these steps until you are satisfied with the results of the Word Extraction algorithm.
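The arithmetic behind the example in steps 6 and 7 can be checked with a short Python sketch. It parses the sample line from the Word Extraction Data file shown above and re-evaluates the comparison with the old and new thresholds. The regular expression and the function names are assumptions made for this illustration; the file format is inferred from that single sample line only.

```python
import re

LINE = (
    "Word 54: Samples(42821-42962) | Horizontal distance of pendown between words: "
    "PendownLeftx(11.295 cm) - PreviousPendownRightx(10.8685 cm) = 0.4265 cm "
    ">= MinLeftPendown(0.3 cm)"
)


def parse_cm_values(line: str) -> dict:
    # Collect every "Label(number cm)" pair; "Samples(...)" has no unit and is skipped.
    return {name: float(value) for name, value in re.findall(r"(\w+)\(([\d.]+) cm\)", line)}


def splits_word(gap_cm: float, threshold_cm: float) -> bool:
    # Same comparison as printed in the sample line (>=).
    return gap_cm >= threshold_cm


values = parse_cm_values(LINE)
gap = values["PendownLeftx"] - values["PreviousPendownRightx"]

print(round(gap, 4))                                 # 0.4265
print(splits_word(gap, values["MinLeftPendown"]))    # True: split with the 0.3 cm setting
print(splits_word(gap, 0.5))                         # False: no split after raising it to 0.5 cm
```

Raising the threshold above the reported gap is what prevents the unwanted split; any other word boundary in the trial whose gap is also below 0.5 cm will be merged as well, which is why each change should be followed by reprocessing and visual inspection.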




© NeuroScript LLC. All Rights Reserved.