Pre-Presentation Notes
Slides and presentation materials are available online at:


karlwiegand.com/defense
1

Disambiguation of Imprecise User Input Through Intelligent Assistive Communication
Karl Wiegand
Northeastern University
Boston, MA USA
December 2014
2

Thesis Statement


"Intelligent interfaces can mitigate the need for linguistically and motorically precise user input to enhance the ease and efficiency of assistive communication."
3

Theoretical Contributions
"...mitigate the need for linguistically and motorically precise user input..."
An unordered language model that bridges syntax and semantics. [Wiegand and Patel, 2012A]
An empirical comparison of contextual language predictors. [Wiegand and Patel, 2015B (R1)]
A motor movement study with current and potential AAC users. [Wiegand and Patel, 2015A]
4

Applied Contributions
"...to enhance the ease and efficiency of assistive communication."
A semantic approach to icon-based, switch AAC. [Wiegand and Patel, 2014B]
A continuous motion overlay module for icon-based AAC. [Wiegand and Patel, 2012B]
Mobile, letter-based AAC that supports conversational speeds. [Wiegand and Patel, 2014A]
5

Outline
Assistive Communication
Theoretical Contributions
Applied Contributions
Summary and Conclusion
6

Part 1: Assistive Communication
7

On Communication
SMCR and derivatives [Shannon and Weaver, 1949]
Affected by distortion to any component
What if there is distortion from the Source?
8

Who Uses AAC?
People of all ages; ~2 million in US [NIH, 2000]
Developmental disorders:
Autism, cerebral palsy...
53% of people with CP use AAC [Jinks and Sinteff, 1994]

Neurological and neuromotor disorders:
ALS, MD, MSA, stroke, paralysis...
75% of people with ALS use AAC [Ball, 2004]
9
AAC stands for Augmentative and Alternative Communication and is primarily used by people for whom...

Functional Definitions
Target users are primarily non-speaking and may have upper limb motor impairments
Target users may also have developing literacy or language impairments
10

Types of AAC
Physical Boards
Electronic Systems
Letter-Based
Icon-Based
11

On Speed of Communication

Speech is often 150 - 200 words per minute
[Beasley and Maki, 1976]
vs.

Typical AAC is < 20 words per minute
[Higginbotham et al, 2007]
13

Modern AAC Application
14
SpeakForYourself, an icon-based AAC application for iOS and Android

The Problem
15

What is the Goal?
Make AAC more intelligent
"Intelligent" meaning:
User-specific
Adaptive
Context-sensitive
16

How?
By addressing some common assumptions:
Prescribed Order
Intended Set
Discrete Entry
17

Assumption 1: Prescribed Order
Users will select items in a specific order, such as the syntactically "correct" one.
Users do not always select items in expected order [Van Balkom and Donker-Gimbrere, 1996]
Using AAC devices is slow [Beukelman et al, 1989; Todman, 2000; Higginbotham et al, 2007]

Assumptions of diminished capacity
18

Assumption 2: Intended Set
Users will select exactly the items that are desired -- no fewer or more.
Motor and cognitive impairments may result in missing or additional selections [Ball, 2004]
Letter-based text entry systems detect accidental and missing selections
19

Assumption 3: Discrete Entry
Users will make discrete movements or selections, either physically or with a cursor.
Some letter-based systems have started to remove this assumption [Goldberg, 1997; Kristensson and Zhai, 2004; Kushler and Marsden, 2008; Rashid and Smith, 2008]

Many input signals are naturally continuous
20

The Goal
21

Part 2: Theoretical Contributions
22

Theoretical Contributions
Prescribed Order: Semantic Frames, Semantic Grams
Intended Set: Semantic Grams, Contextual Prediction
Discrete Entry: Personalized Interaction
23

Addressing Prescribed Order
Statistical MT [Soricut and Marcu, 2006]
Semantic frames, CxG, and PAS [Fillmore, 1976]

Give ( Agent, Object, Beneficiary )
WordNet, FrameNet, "Read the Web" (NELL), Groningen Meaning Bank

Computationally intense to obtain statistics
25

Motivating Questions
Can we create a simple and fast language model for use with semantic frames?
Current completion and prediction strategies rely on syntactic order and word distance
N-grams, s-grams, skip-grams, CVSMs, etc.
Compansion [McCoy et al, 1998]
Memory-based LMs [Van Den Bosch and Berck, 2009]
Can utterances be predicted/completed without assuming order and distance?
26

Motivating Examples
Prior Input: play, video games, i, brother
Output: "My brother and I play video games."

Prior Input: play, chess, i, dad
Output: "I play chess with my dad."

Input: i, brother, ...
Output: ?
27

Possible Approach
Sentences are one of the smallest units of language that are:

Semantically coherent
Semantically cohesive
Syntactically demarcated

How can they be leveraged for prediction?
28

Semantic Grams
A multiset of words that appear together in the same sentence.

"I like to play chess with my brother."
brother, chess (1)
brother, i (1)
brother, like (1)
brother, play (1)
chess, i (1)
chess, like (1)
chess, play (1)
i, like (1)
i, play (1)
like, play (1)
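
The pair list above can be reproduced with a short sketch. This is only an illustration, not the code used in the study: the stop list is a hypothetical one chosen so the output matches the slide, and repeated words within a sentence are collapsed before pairing, which is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop list, chosen only so this example matches the slide.
STOP_WORDS = {"to", "with", "my"}

def sem_grams(sentence):
    """Return a Counter of unordered word pairs that co-occur in one sentence."""
    words = [w.strip('.,!?"').lower() for w in sentence.split()]
    # Assumption: repeated words within a sentence are counted once.
    kept = sorted({w for w in words if w and w not in STOP_WORDS})
    # Every unordered pair gets the same weight, regardless of word distance.
    return Counter(combinations(kept, 2))

pairs = sem_grams("I like to play chess with my brother.")
for (a, b), count in sorted(pairs.items()):
    print(f"{a}, {b} ({count})")
```

Adding the Counters from many sentences together gives corpus-level counts, with the sentence boundary acting as the context window.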
29

More on Sem-Grams
Sentence Boundary Detection (SBD) is fast and relatively accurate (> 98.5%)
Sentences provide dynamic context windows
Sentence-level co-occurrence with uniform weight applied to all relationships in a sentence
30

Sem-Grams Study
Blog Authorship Corpus
140 million words from 19,320 bloggers
Age range of 13 - 48; balanced genders

Split by authors: 80% training, 20% testing
2 n-gram and 2 sem-gram algorithms
Naive Bayes: N1 and S1
N2 (weighted adjacency) and S2 (full independence)
31

Method
For every test sentence:
Process (split, stop, stem, and check)
Shuffle stems
Remove one (target)
Query each algorithm for missing stem (ranked list)

Evaluation: random 2000 sentences

Score: position of target (lower score is better)
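
The shuffle, remove, query, and score loop can be sketched as follows. The ranking function here is a plain summed co-occurrence count, used as a stand-in; it is not the N1, N2, S1, or S2 algorithm from the study, and the training stems are made up for illustration.

```python
import random
from collections import Counter
from itertools import combinations

def train(sentences):
    """Count sentence-level co-occurrences for every unordered stem pair."""
    counts = Counter()
    for stems in sentences:
        for a, b in combinations(sorted(set(stems)), 2):
            counts[(a, b)] += 1
    return counts

def rank_candidates(counts, input_stems):
    """Rank candidate stems by summed co-occurrence with the input stems."""
    scores = Counter()
    given = set(input_stems)
    for (a, b), n in counts.items():
        if a in given and b not in given:
            scores[b] += n
        elif b in given and a not in given:
            scores[a] += n
    # Ties are broken alphabetically so the ranking is deterministic.
    return [s for s, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

def score_sentence(counts, stems, rng):
    """Shuffle, remove one target stem, and return its 1-based rank (or None)."""
    stems = list(stems)
    rng.shuffle(stems)
    target = stems.pop()
    ranked = rank_candidates(counts, stems)
    # None corresponds to a "failure to predict".
    return ranked.index(target) + 1 if target in ranked else None

# Hypothetical, already-stemmed training sentences.
training = [
    ["play", "video", "game", "i", "brother"],
    ["play", "chess", "i", "dad"],
    ["i", "brother", "watch", "movi"],
]
counts = train(training)
print(rank_candidates(counts, ["i", "brother"]))
print(score_sentence(counts, ["play", "chess", "i"], random.Random(0)))
```

Because the input stems are shuffled before querying, a predictor that depends on word order or distance loses its main signal, which is the condition the study is designed to test.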
32

Results: Example 1
Original: “This semester Im taking six classes.”

Target Stem: class
Input Stems: take, semest, six

N1 Candidate List: next, month, class, hour, last, second, week, year, first, five, flag, ...

S1 Candidate List: class, month, year, last, time, one, go, day, get, school, will, first, ...

33

Results: Example 2
Original: “Hey, they’re in first, by a game and a half over the Yankees.”

Target Stem: game 
Input Stems: yanke, hey, first, half 

N1 Candidate List: game, stadium, like, hour, time, year, day, guy, hey, fan, say, one, two, ... 

S1 Candidate List: game, got, like, red, time, play, team, sox, hour, go, fan, one, get, day, ...
34

To further demonstrate the difference between these two approaches, I've highlighted some of the words here...

Results: Performance of Sem-Grams
36

Summary of Sem-Grams
Simple, "fast" (SBD), and distance-agnostic

More accurate than similar n-gram-based algorithms
Alternative to more complex methods
Natural fit for use with semantic frames
37

Improving Unordered Prediction
Dropping assumption of order results in information loss
How can we compensate?
Devices often ask for user demographics
Mobile AAC devices have sensors:
Date
Time
Location
39

Motivating Questions
Almost all statistical LMs require background probabilities (priors)
Most systems use Google's N-Gram Corpus, Wall Street Journal, or New York Times
How much closer to a real user's priors can we get by leveraging context?
40

Contextual Prediction
23-year-old female in Seattle
23-year-olds
Global
Seattle
23-year-old females
41

Contextual Prediction Study
Blog Authorship and Yelp Academic Dataset
Contexts: age, gender, day of the week, day of the month, month, city, and state
Map unigrams to contexts for all authors; minimal stops and no stemming
Attribute   Blog Authorship   Yelp
Authors     19,320            130,850
Features    525,253           134,199
42

Method
Split by authors: 90% training, 10% testing
For every test author's unique context:
Obtain the true distribution (target)
Compare to distribution from each predictor combo based on non-target 9 folds

Metrics: Kullback-Leibler Divergence, Cosine Similarity, and Precision@20
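
A minimal sketch of the three metrics over unigram distributions stored as dictionaries. The smoothing floor `epsilon` and the exact definition of Precision@20 (overlap between the two top-k word sets) are assumptions made for illustration; the study may have defined them differently.

```python
import math
from collections import Counter

def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def kl_divergence(target, predicted, epsilon=1e-9):
    """KL(target || predicted); epsilon is an assumed floor for unseen words."""
    return sum(p * math.log(p / max(predicted.get(w, 0.0), epsilon))
               for w, p in target.items() if p > 0)

def cosine_similarity(p, q):
    """Cosine of the angle between two sparse probability vectors."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def top_k(dist, k):
    """The k most probable words, ties broken alphabetically."""
    return {w for w, _ in sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))[:k]}

def precision_at_k(target, predicted, k=20):
    """Fraction of the predictor's top-k words that are also in the target's top-k."""
    top_predicted = top_k(predicted, k)
    if not top_predicted:
        return 0.0
    return len(top_predicted & top_k(target, k)) / len(top_predicted)

# Hypothetical word counts for one target context and one predictor.
target = normalize(Counter("coffee rain coffee hike".split()))
predicted = normalize(Counter("coffee rain traffic".split()))
print(kl_divergence(target, predicted))
print(cosine_similarity(target, predicted))
print(precision_at_k(target, predicted, k=2))
```

Lower is better for KL divergence, while higher is better for cosine similarity and Precision@k, so the metrics can rank the same predictors differently.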
43

Method Example
Target Distribution
Age: 23
Gender: Female
DOW: Monday
DOM: 25 - 31
Month: July
City: Seattle
State: Washington
Predictor Combos
Age
Gender
DOW
Age + Gender
Month + City
Age + Gender + City
...
(48 in total)
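
One way to picture a predictor combo is as a filter over training posts followed by a unigram count. The sketch below assumes each post is a dictionary with metadata fields and a `text` field; those field names and the sample posts are hypothetical, not taken from either corpus.

```python
from collections import Counter

def context_distribution(posts, **context):
    """Unigram distribution over posts whose metadata matches every given attribute."""
    counts = Counter()
    for post in posts:
        if all(post.get(key) == value for key, value in context.items()):
            counts.update(post["text"].lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}

# Hypothetical training posts with hypothetical metadata fields.
posts = [
    {"age": 23, "gender": "female", "city": "Seattle", "text": "coffee and rain"},
    {"age": 23, "gender": "male", "city": "Boston", "text": "snow again"},
    {"age": 40, "gender": "female", "city": "Seattle", "text": "rain rain"},
]

global_dist = context_distribution(posts)                        # no context
by_age_gender = context_distribution(posts, age=23, gender="female")
by_city = context_distribution(posts, city="Seattle")
print(by_city)
```

Calling the function with no keyword arguments gives the global (no context) distribution, which serves as the baseline that each combo is compared against.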
44

Results: Predictors by Metric
Rank   KL Divergence              CosSim & Prec@20
1      DOW+DOM+Month+City         Gender+DOM+Month
2      Age+Gender+DOW+DOM+Month   Gender+Month
3      Age+DOW+DOM+Month          Age+Month
4      DOW+DOM+Month+State        Gender+DOW+Month
5      DOW+Month+City             Age+Gender
6      Age+Gender+DOW+Month       Age
7      DOM+Month+City             Age+DOM
. . .
(No Context): rank 47 by KL Divergence; ranks 31, 27 by CosSim & Prec@20
45

Summary of Context
Contextual distributions can be more accurate than global statistics
Location better by KL; demographics better by CosSim and Prec@20
Some combinations consistently better:
Gender + DOM + Month
Age + Gender + DOW + Month
Age + Gender + DOM
Age + Month
46

Addressing Discrete Entry
Physical path or signal characteristics
Rotated unistroke recognition [Goldberg, 1997]
Letter-based paths [Kristensson and Zhai, 2004; Kushler, 2008]
Relative positioning [Rashid, 2008]
Well-received by non-disabled users
48

Motivating Questions
Modern AAC now deployed on touchscreens
Increasing research on accessibility
Fitts and Steering Laws [Fitts, 1954; Accot and Zhai, 1996]
Swabbing/sliding is easier [Wacharamanotham et al, 2011]
Buttons need to be bigger [Chen et al, 2013]
What about functional compensation?
Can we learn realistic, layout-agnostic interaction patterns for an individual user?
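
For reference, the two laws cited above can be written as small functions. This sketch uses the Shannon formulation of the Fitts index of difficulty, which is a later variant and not the exact form in [Fitts, 1954], and the straight-tunnel case of the steering law. The constants `a` and `b` are placeholders that would be fit per user and device.

```python
import math

def fitts_index_of_difficulty(distance, width):
    """Shannon formulation of the Fitts index of difficulty, in bits."""
    return math.log2(distance / width + 1)

def steering_index_of_difficulty(length, width):
    """Steering-law index of difficulty for a straight tunnel of constant width."""
    return length / width

def movement_time(index_of_difficulty, a, b):
    """Linear model MT = a + b * ID; a and b are empirically fitted constants."""
    return a + b * index_of_difficulty

# Doubling the button width lowers the difficulty of a 300-pixel reach.
print(fitts_index_of_difficulty(300, 100))
print(fitts_index_of_difficulty(300, 200))
print(steering_index_of_difficulty(400, 100))
```

Both models assume a typical movement profile, which is why the question of learning an individual user's actual interaction patterns follows naturally.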
49

Motor Optimization GUI (MoGUI)
50

MoGUI Example
51

MoGUI Study
Residents at the Boston Home
Current and potential AAC users
10 females and 5 males
Ages 35 - 71 (mean of 56)
8 right-handed; 7 left-handed (3 due to MS)
2 cross-balanced sessions: taps vs. slides
4x4 grid = 16 locations
Pseudo-random shuffling (a la Latin Squares)
52

Method
10.1" Android tablet in comfortable, landscape position; fully reachable
Choice of finger or stylus
10 levels of 3 rounds each
1, 2, 3, ... 10 balloons per round, by level: 3 × (1 + 2 + ... + 10) = 165 total
Track all hits, misses, and timing
53

Results: Variability of Tap Misses
54
Multiple Taps
Fingers Dragging
Hand Resting
Thumb Usage

Results: Locations by Handedness
Figure panels: Left, Right (mean speed-to-target in pixels/second)
55

Results: Directions by Handedness
Figure panels: Left, Right (mean speed-to-target in pixels/second)
56

Summary of Personalization
Sliding not significantly faster than tapping for arbitrary targets; no motor learning
16% accidental slides; 43% accidental taps
High variance in individual motor patterns; weak correlations by handedness
Gamified calibration
Static improvements through personas:
Handedness → margins, button locations
Tap/slide preferences → input sensitivity
57

Part 3: Applied Contributions
58

Applied Contributions
RSVP-iconCHAT: Free Order, Discrete Icons
SymbolPath: Free Order, Continuous Icons
DigitCHAT: Mobile, Mixed-Input Letters
59

A Collaborative Effort
Locked-In Syndrome (LIS)
Spinal injuries, ALS, tumors, strokes...
1% of ischemic strokes [Smith and Delargy, 2005]
Icon-based, switch AAC for people with LIS
Dr. Deniz Erdogmus and Dr. Rupal Patel
Minimal switch/signal requirements (1+)
Goal of a brain-computer interface (BCI)
Verb-first message construction [Patel et al, 2004]
60

Rapid Serial Visual Presentation
Used in psychology, speed-reading, lie detection, and letter-based BCI [Orhan et al, 2012]
61

RSVP-iconCHAT
62

Observations
Prediction/ordering controls speed of message construction
Natural fit for prediction via semantic grams
Required screen space is now tied to message complexity
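
The first observation can be made concrete with a rough timing sketch. It assumes one item is shown per interval and that the user responds while the target is on screen; the 700 ms default mirrors the starting RSVP rate used in the study, while the word lists and the predicted ordering are hypothetical.

```python
def presentation_time_ms(ordering, target, interval_ms=700):
    """Time until `target` has been shown when items appear one per interval."""
    return (ordering.index(target) + 1) * interval_ms

alphabetic = ["apple", "ball", "book", "coffee", "dog", "water"]
# Hypothetical ranking from a predictor that favors likely next words.
predicted = ["coffee", "water", "apple", "ball", "book", "dog"]

print(presentation_time_ms(alphabetic, "coffee"))
print(presentation_time_ms(predicted, "coffee"))
```

With a single switch, the position of the target in the presentation order is the dominant cost, so any reordering that moves likely items forward shortens construction time directly.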
76

RSVP-iconCHAT Study
24 non-disabled participants (ND)
14 females and 10 males
Ages 19 - 43 (mean of 24)
4 participants with speech and motor impairments (SMI)
2 females and 2 males
Ages 33 - 56 (mean of 41)
Space bar as switch mechanism
Up to 106 words in alphabetic order
77

Method
For every participant:
Introduction and 3 training cards
Shuffle 30 picture cards
Use the system to describe each card
RSVP starting at 700ms; adjustable at any time
78

Results: Construction Time
79

Overview of Results
Average speed of last 5 utterances:
70s (ND) vs. 107s (SMI)
No nonsensical utterances
Average of 5 selections (verb + 4)
RSVP speeds w/ positive motor response:
700ms (ND) vs. 1200ms (SMI)
80

Summary of RSVP-iconCHAT
Immediately applicable to mobile systems
Message complexity can be scaled (personalized)

Expandable to multi-modal or analog input:
Push the switch harder to go faster
Directional switches
"Oops" functionality

Involuntary responses (BCI) could leverage predictive reordering via sem-grams
81

SymbolPath Motivation
83

SymbolPath
"I need more coffee"
84

Summary of SymbolPath
Designed for people with upper limb motor impairments or developing literacy
Semantic grams reweighted by path contour
75+ active users on Android
Regular email feedback: "It's fun!"
Drawing and syntactic completion/generation encourages fuller utterances
85

DigitCHAT Motivation
87

DigitCHAT
Word-by-word, real-time construction
Mixed-mode input and active learning
88

Summary of DigitCHAT
Scalable and fast (> 45 WPM) [Silfverberg et al, 2000]
Compare to < 20 WPM for most AAC systems
15+ active users on Android
Winner of the ACM ASSETS 2014 Text Entry Challenge
89

Projected DigitCHAT
Head-tracking prototype by Dan Lazewatsky and Bill Smart 
(Oregon State University)
90

Part 4: Summary and Conclusion
91

Thesis (Redux)


"Intelligent interfaces can mitigate the need for linguistically and motorically precise user input to enhance the ease and efficiency of assistive communication."
92

Theoretical Contributions
"...mitigate the need for linguistically and motorically precise user input..."
An unordered language model that bridges syntax and semantics. [Wiegand and Patel, 2012A]
An empirical comparison of contextual language predictors. [Wiegand and Patel, 2015B (R1)]
A motor movement study with current and potential AAC users. [Wiegand and Patel, 2015A]
93

Applied Contributions
"...to enhance the ease and efficiency of assistive communication."
A semantic approach to icon-based, switch AAC. [Wiegand and Patel, 2014B]
A continuous motion overlay module for icon-based AAC. [Wiegand and Patel, 2012B]
Mobile, letter-based AAC that supports conversational speeds. [Wiegand and Patel, 2014A]
94

Revisiting the Goal
95

Special thanks to the Continuous Path Foundation and the National Science Foundation (Grants #HCC-0914808 and #SBE-0354378).

Thank you for listening!
karlwiegand.com/defense
97

Sem-Grams: Method Details
Test sentences truncated to 20 words

All algorithms seeded with top 10 type-specific grams for each input word

Maximum of 190 candidate words to rank (10 per input word, up to 19 input words)

Absence of target word in list was considered a "failure to predict"
99

Sem-Grams: Overview of Results
                 N1      N2      S1      S2
# of Sentences   2000    2000    2000    2000
# Predicted      647     649     435     435
Average Score    16.26   19.70   9.04    12.67
100

Sem-Grams: Performance
101

Context: Method Details
Predictor                Blog Authorship   Yelp
Age                      26                -
Gender                   2                 -
Day of the Week (DOW)    7                 7
Day of the Month (DOM)   31 (4)            31 (4)
Month                    12                12
City                     -                 119
State                    -                 16
Average of 18 unique contexts per author in Blog Authorship and 4 in Yelp Dataset

102

MoGUI: Observations
Varied tablet and hand/arm positions
Tablet being held, flat/tilted on lap, on desk, tilted on table, held in wheelchair mount
Use of fingers, thumb, stylus, and knuckles
Ghost tapping, spastic tapping, stylus friction, and finger humidity
Repeated margin activation and triggering of Google Now functionality
103

Brain-Computer Interfaces (BCI)
http://www.emotiv.com/
http://www.neurosky.com/
104
Emotiv Epoc on the left and Neurosky's MindWave on the right

The P300 Wave
105

Complexity vs. Real Estate
106

RSVP-iconCHAT: Construction Time
107

RSVP-iconCHAT: Feedback
All users get restless w/ alphabetic ordering
Even alphabetic ordering can be surprising
All users with SMI asked about other switches and multi-modal methods
All users favorably mentioned the automatic syntax generation/modification
112
