SOAPS: Accessible voice CAPTCHAs for Internet Telephony


  • CAPTCHAs help protect services from automated requests
  • Internet Telephony (VoIP) is becoming popular -> risk of voice spam
  • Could CAPTCHAs be used to prevent VoIP spam?
  • Callers not on a whitelist or blacklist would have to prove they are human before calling a recipient

Research questions:

  • Can CAPTCHAs work as a spam prevention mechanism?
  • Can CAPTCHAs be adapted to telephony, given the wide variety of phones and networks?
  • How will users react? Are solutions sufficiently usable and accessible?


  • Unknown callers are diverted to a CAPTCHA server; on pass, they are transferred to the actual call to the recipient. The recipient can update the access list to prevent future CAPTCHAs (or calls)
  • Note: John (initiator) doesn’t need any additional hardware or software.
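The access-list check described above can be sketched as a simple routing function (a hypothetical illustration; the function and list names are not from the talk):

```python
def route_call(caller_id, whitelist, blacklist):
    """Decide how to handle an incoming VoIP call.

    Known-good callers connect directly; known-bad callers are rejected;
    everyone else is diverted to the CAPTCHA server first.
    """
    if caller_id in blacklist:
        return "reject"
    if caller_id in whitelist:
        return "connect"
    return "captcha"  # must prove humanity before the call is transferred
```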


  • Implemented test framework for Skype
  • 5-digit audio CAPTCHA with no distortions (i.e., not secure; the emphasis was on testing user reactions)


  • 10 participants; the length of instructions given to users was varied as a subgroup condition (2 lengths)
  • Input methods: laptop keyboard and plug-in “mobile phone”


  • Most were surprised about the CAPTCHA (not informed ahead of time)
  • Majority of users passed on the first try
  • Users in short-instructions group made more mistakes
  • The overall ease-of-use ratings were high
  • After multiple tasks, users’ “pleasant” grade decreased.

Usability challenges:

  • Becomes boring quickly, but shortening or skipping the instructions costs time later (users make more mistakes).
  • Callers had difficulty comprehending all of the information at once (e.g., * key to submit, # key to reset).
  • What works with one device can be unusable on another (e.g., gaps between digits need to be larger for mobile phone users).
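The submit/reset key handling mentioned above might look like this (a minimal sketch; the actual Skype test framework's logic is not published here):

```python
def collect_code(key_presses):
    """Accumulate DTMF digits; '#' resets the buffer, '*' submits it."""
    buf = []
    for key in key_presses:
        if key == '#':          # reset: start over
            buf = []
        elif key == '*':        # submit whatever has been entered
            return "".join(buf)
        elif key.isdigit():
            buf.append(key)
    return None                 # caller never submitted
```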

Design improvements:

  • “Press any number key”, e.g., when a certain sound is heard
  • “Press any key n times”
  • Redesign cancel function (or remove it and have user retry on failure)
  • Design instructions carefully, and present them in the user's native language. More information may be necessary, despite the desire to limit the length of the interaction.
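A “press any key n times” challenge is easy to state and verify; a sketch (names and parameters are illustrative, not from the talk):

```python
import random

def make_count_challenge(rng=random):
    """Pick how many key presses to ask for and build the prompt."""
    n = rng.randint(2, 6)
    return n, f"Press any number key {n} times, then press star."

def check_count(n, keys):
    """Pass if exactly n digit keys were pressed before the '*' submit."""
    if not keys or keys[-1] != '*':
        return False
    digits = keys[:-1]
    return len(digits) == n and all(k.isdigit() for k in digits)
```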

Next steps:

  • Test with different user groups: elderly users, non-native speakers of the CAPTCHA language
  • Is the CAPTCHA feasible for spam protection?


Q: Was the task digit-at-a-time or all digits at once?
A: The task was digit-at-a-time.


SOAPS: An improved audio CAPTCHA

Jennifer Tam, Jiri Simsa, David Huggins-Daines, Luis von Ahn and Manuel Blum

CAPTCHAs – a test to determine if the user is human.
Some existing audio CAPTCHAs have only a 70% human passing rate, because the additional noise injected into the audio makes discerning the digits difficult.
Additional concern: task time is much greater than with visual CAPTCHAs.

Are current CAPTCHAs secure?

  • Considered insecure if it can be beaten 5% of the time.
  • Because of the limited vocabulary, a trained system can beat them 45% of the time.

Testing targeted at: Google, reCAPTCHA, Digg
Sampled 1000 from each

Algorithm to break them:

  • Segment audio
  • Features: classify as digit/letter, noise, or voice


  • Training data was manually segmented and labelled.
  • Testing used an automatic segmenting algorithm
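An automatic segmenter of the kind mentioned can be approximated by framewise energy thresholding (a simplified sketch, not the authors' algorithm; the frame size and threshold are illustrative):

```python
import numpy as np

def segment_audio(samples, rate, frame_ms=20, energy_thresh=0.01):
    """Return (start, end) sample indices of high-energy regions."""
    frame = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame
    energy = np.array([np.mean(samples[i*frame:(i+1)*frame] ** 2)
                       for i in range(n_frames)])
    voiced = energy > energy_thresh
    segments, start = [], None
    for i, v in enumerate(voiced):
        if v and start is None:
            start = i                                # segment begins
        elif not v and start is not None:
            segments.append((start*frame, i*frame))  # segment ends
            start = None
    if start is not None:
        segments.append((start*frame, n_frames*frame))
    return segments
```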

Feature algorithms:

  • Mel-frequency cepstral coefficients (MFCC)
  • Perceptual linear prediction (PLP)
  • Relative spectral transform with PLP (RASTA-PLP)

Trained with AdaBoost, SVM, k-NN
Algorithm: segment, then recognize (features -> labels); repeat until all segments are consumed or a maximum solution size is reached
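The segment-then-recognize loop can be sketched with a toy k-NN classifier over fixed-length feature vectors (illustrative only; the paper used MFCC/PLP/RASTA-PLP features and also trained AdaBoost and SVM classifiers):

```python
import numpy as np

def knn_label(features, train_X, train_y, k=3):
    """Majority vote among the k nearest training examples."""
    dists = np.linalg.norm(train_X - features, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def solve_captcha(segment_features, train_X, train_y, max_len=10):
    """Label each segment, drop noise, stop at the maximum solution size."""
    guesses = [knn_label(f, train_X, train_y) for f in segment_features]
    return "".join(g for g in guesses if g != "noise")[:max_len]
```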

  • 66% Google, 45% reCAPTCHA, 71% Digg. These are exact-match rates; rates are higher if errors are allowed (at least Google permits 1 error in the response).
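The "1 error allowed" grading mentioned above amounts to a per-position mismatch count (a sketch of what such a policy could look like; not Google's actual code):

```python
def passes(guess, truth, max_errors=1):
    """Accept a response whose digit-by-digit mismatches stay within budget."""
    if len(guess) != len(truth):
        return False
    return sum(a != b for a, b in zip(guess, truth)) <= max_errors
```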

How to build a better audio CAPTCHA?

  • Apply reCAPTCHA’s visual approach to audio techniques.
  • As in visual reCAPTCHA, have humans transcribe audio that failed Automatic Speech Recognition (ASR), but use audio that is spoken clearly.

How will it work?

  • Start with phrases with known transcriptions.
  • Users are given adjacent phrases to transcribe.
  • The un-transcribed phrase's transcription is recorded once the known phrase's transcription is matched.
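The matching step, together with the case/punctuation normalization mentioned in the Q&A below, could be sketched as follows (hypothetical names; not the deployed reCAPTCHA code):

```python
def normalize(text):
    """Compare transcriptions ignoring case and punctuation."""
    kept = (ch.lower() for ch in text if ch.isalnum() or ch.isspace())
    return " ".join("".join(kept).split())

def grade(known_truth, known_answer, unknown_answer, candidates):
    """Pass the user if the known phrase matches; bank the unknown one."""
    if normalize(known_answer) != normalize(known_truth):
        return False
    candidates.append(normalize(unknown_answer))  # a vote toward a transcription
    return True
```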

Security Analysis:

  • Speaker independent recognition and open vocabularies are difficult for ASR systems.
  • AM broadcast and MP3 encoding cause coding degradation, which also reduces ASR performance.


  • Improved accessibility for reCAPTCHA
  • Provide transcriptions for non-transcribed audio

Q: How will the bad guys respond to this new technique?
A: We will be collecting data as the system runs, detecting weak items and removing them from the system. It should be possible to stay ahead of the bad guys (by having more complete data). Different radio-show sources will provide different background patterns, which would need to be segmented in the bad guys' training data as well.
Q: Radio shows pushed the development of “widely understandable” accents. Does this make them particularly vulnerable to computer attack?
A: Not clear this will be a problem, as there were various accents encouraged in the shows, among other reasons.
Q: What about language barriers?
A: Eventually include audio sources from other languages, perhaps chosen by location or menu selection.

Q: How do you plan to clean the data for spelling, punctuation, etc.?
A: Currently, ignore case and punctuation (for comparison).
Also planning to deploy a dictionary for cleanup/comparison.
Q: Can users poison the system?
A: There are statistical tests for ultimate acceptance of a given transcription, as well as other techniques.
Q: What about the deaf-blind users?
What techniques for computer use are available?
A: ASCII output/keyboard input.
Q: Would it be possible to use a haptic device with a waveform?
Q: To deal with dyslexia, perhaps combine the audio with a visual representation?
A: Runs the risk of giving the computer more leverage as well, and there are no transcriptions for the audio to use to generate the visual representation.


Talk: Research Directions for Network Intrusion Recovery

Authors: Michael Locasto, Matthew Burnside, Darrell Bethea

The authors realized these areas affect more people than just themselves and would like feedback on these topics and research. Network intrusion detection and recovery is underappreciated on the recovery side; it is seen as boring system administration work.

Focus: usable systems for intrusion response. One benefit would be an incident archive. Organizations and people have a disincentive to share incidents (bad PR), so there is no archive for researchers to mine to see which problems to focus on. Network intrusion recovery is thus a difficult area for researching, designing, and creating usable security mechanisms.

Started logging incidents: March '07, Dec '07, March '08. This talk covers Dec '07; the rest are in the paper.

The graphics research group in CS got 4 new machines with NVIDIA cards and unofficial drivers. IT staff installed the non-standard drivers and all was good. Twelve months later, the machines crash on 12/6/07. Added to the ticketing system; IT staff rebooted them and everything seemed fine. Monday, 12/10, they crashed again. It was finals week, so diagnosis didn't start until 12/13. There are two rdist masters, and they start to crash as well on Monday 12/10. There had been a recent kernel upgrade, so roll it back. Crashes again on the 13th and 14th. On the 17th, they compile the kernel rather than just applying the binary. The make failed because it could not create directories whose names are just numbers. Sounds like a rootkit that might be intercepting file IDs. This is the first time anyone thinks it might be a security issue. Booting from CD shows that common utilities were replaced. Every machine managed with rdist (200 machines!) has been compromised. And then the staff goes home for the holidays at the end of the week; plus, on Friday the 21st, half the staff is leaving for new jobs. They switched to a different Linux flavor, changed everyone's password, and sent text messages telling everyone to go out-of-band.

Lessons learned

There was no recovery agenda, and there were multiple conflicting points of view. Masters students run much of the show, so no one is there long term. Decisions are informal and qualitative: why switch from RedHat? The swaying argument was that the person doing the install was “comfortable with the package management system.” Why is that the right factor for security-related decisions? But the RedHat advocates had moved on from the group, so the remaining people wanted to install what they knew. How do you create and update a plan in the face of so much churn? Reviewing once a year isn't going to be enough. How do we do this in a usable and efficient way?

Human memory is pretty bad. People involved in multiple incidents confused what happened when, and there wasn't clear record keeping.

IDS systems don't work. The rootkit conflicted with the unofficial video drivers, so the machines crashed; in another incident, an NFS mount failed. Even when Snort is turned on, who's going to look at 500 messages a day? The infrastructure is weak, and the human-level issues complicate things even further.

Tension about forensics: do you keep a machine up or take it down?

Staff and the ISP might want to take the machine down; you don't want a reputation for spreading a worm. But you might want to keep it up to figure out what's going wrong and to be able to fix it. Users want to stop the threat to their privacy, but if it's a critical machine, or during finals, it may not be possible to take it down. Eight months later there are still machines vulnerable to the same attacks.

Research directions

  • Not just technical but human problems. The first approach was a bulleted list of what could be possible, but that doesn't get across the interactions. Used Tufte as a starting point for visualizing a “decision surface” to help plan out activities and see where the complexities lie.
  • Predict latent vulnerabilities based on what you've already learned.
  • Recording infrastructure with “recovery trees”; figure out how to integrate it with current tasks. You need a system woven into the infrastructure.
  • Technical comparisons of alternatives: NLP on release notes, query bug databases, etc.


The community should focus on creating mechanisms that treat recovery as a system of both humans & computers.

Q: Recovery trees could help with things like where the LDAP servers are and what happens if they fail? A: Need to know how things work now, and that's hard with 25 years of history and no notes. You need a system that can figure out where things are.

Q: It's interesting when stuff breaks; we giggle when grandma says “my computer doesn't work, it must be a virus.” How many times does stuff break when it isn't security? A: Don't know; probably most breakage is not security-related. You dig when you find a symptom, e.g., maybe the network is slow.

Q: When building a DB of incidents, there are two problems: the kinds of problems people have may be so different they can't find anything useful, and organizations aren't highly willing to share. A: People do experience the same sorts of things, so there is value in comparing notes. There is also value for research, especially as people bring new tools into play and want to evaluate them. On the second part: you have to get friendly with sysadmins. They're willing to share incidents; you just have to talk to IT directly.
Q: Incident response varies by inside or outside threat. Any data on percentages? In your case, was it inside or out? A: We don't know; we suspect outside. We don't have data, and even defining what an insider is gets hard. See the Verizon report:

Q: Was the driver the threat? A: No, it was just the canary that revealed the rootkit.


SOAPS: Challenges in Universally Usable Privacy and Security

Presented by Harry Hochheiser.

User diversity (young, old, varying motor skills), technology diversity, and context of use (home or office environment, physical factors, social factors) impact the way that people can interact with systems.

Security and privacy mechanisms require users to “jump through hoops” to prove themselves (or pay attention to things they’d rather disregard).

  • Additional information (security indicators)
  • Additional tasks (email encryption)
  • Harder tasks (passwords, CAPTCHA)

These mechanisms raise accessibility barriers.

Anti-phishing tools

  • These tools depend on site content and cues available in the browser, elements that may be inaccessible in screen-reading software.
  • Features of the tools may be hard to understand for seniors or the visually impaired.


  • Remember passwords, manage multiple accounts – may be difficult for people w/ cognitive disabilities and physical disabilities


  • Visual CAPTCHAS and Audio CAPTCHAS may be difficult for people who are visually impaired, or in loud environments.

Several tools exist to check for accessibility, but they all give different results from each other; there is a lack of really good tools to help developers check. The tools themselves are not enough: screen-reading software should also be used, or the developer should check by turning the screen off, not using a mouse, and seeing if they can still navigate.

Possible approaches for universally usable privacy and security:

  • User diversity
    – Providing alternative forms of content (cons: may reduce effectiveness, and incurs high development and maintenance costs)
    – Development of a single system that is accessible to diverse populations.
  • Gaps in user knowledge
    – Development of easily understandable vocabulary and icons
    – Transparent system actions
    – Better training
  • Technology diversity
    – Consideration for small displays
    – Consideration for small input devices

(Audience: Universally Usable designs benefit _everyone_. )

Running user studies: try diverse user groups to find out how and why people are falling for phishing attacks.


SOAPS Opening Session

It’s a treat this year to have the Symposium on Accessible Privacy and Security (SOAPS ’08) as one of the SOUPS workshops.

Welcome: Harry Hochheiser.

Mark Riccobono (National Federation of the Blind) – Security and Accessibility: A Single Track for Innovation

Internet security – prevention of unauthorized access to computer systems via Internet access.
Accessibility – Used to describe the degree to which a product is usable by as many people as possible.

People see these two things in conflict.
– Prevent access from unauthorized vs.
– Expand ability for as many people as possible to access functionality of systems

How do we limit and widen access at the same time?

This contradiction is the gap we wish to bridge.

For the blind, the Internet provides a great deal of access. With screen readers, we can now access information that used to be difficult to obtain.

Exploding Internet = a dearth of developer knowledge about accessibility, design, and security for the blind.

– Non-visual access to information presented on screen.
– Internet promises blind greater independence (finances – online banking, and previously paper based)
– Due to lack of eyesight, the blind are often deemed unauthorized and denied access to information they would receive if only they could see. The blind are denied access only b/c of security mechanisms.

We need to ask ourselves whether or not security and accessibility are opposite concepts. The NFB believes that security and accessibility should be a single track for driving innovation rather than parallel tracks creating frustration w/ all involved.

Due to the desire of blind people to access the Internet, we must extend Internet services. Lots of technology initiatives have been implemented to accomplish this. The NFB is building some innovative partnerships w/ Oracle and Towson University (e.g., a non-visual Web certification program).

Security and accessibility are a single track and equal parts of the equation; the solution is an innovation that is greater than the sum of the parts.


SOUPS 2008!

Welcome to SOUPS 2008, in Pittsburgh, PA. We will be live-blogging the SOUPS conference.

To keep track of the program, visit

So far, the weather is great as is this year’s program.

The favor this year is a clear round bar of soap with the SOUPS logo embedded in it.