Progress report September 1st - December 30th 2000

NEAT A230 - Speech Recognition for People with Severe Dysarthria

(aka STARDUST - Speech Training and Recognition for Dysarthric Users of Assistive Technology)

During the period 1st September to 30th November 2000, considerable progress has been made in identifying how the project aims will be achieved, with clarification provided by a thorough review of the literature and formal group discussions.

As envisaged in the project proposal, three members of staff have been added to the project team:

Due to the recent birth of her child, Dr Lynda Webb is only in a position to provide informal input and is no longer a member of the project team.

It was decided to appoint an external expert advisor to the project to give ongoing peer review of progress. Professor Alan Newell from the University of Dundee was approached but was unable to take up the position. Mark Tatham, Professor of Linguistics at the University of Essex and President of the UK Institute of Acoustics, was therefore approached and has agreed to be the external expert advisor.

Progress with the initial three month targets

During the first three months the research plan identified three areas where efforts were to be concentrated. Namely:

1. Literature search update.

2. Begin recruitment of subjects and initial speech data collections.

3. Initial specification of demonstration applications.

1. Literature search update.

An initial literature search update was completed in October 2000. The main outputs from the initial literature review were:

The literature review is ongoing throughout the project to ensure that where possible the latest techniques are incorporated within the project. The initial literature review is provided in Appendix A.

2. Begin recruitment of subjects and initial speech data collections.

Prior to the commencement of the project MREC ethical approval had been secured and LREC applications for local approval in Barnsley, North Derbyshire, Rotherham and Sheffield (North and South) were submitted. By the end of November only Barnsley and North Derbyshire had granted approval.

Due to the time required to secure local ethical approval it was not possible to approach potential participants during the initial three months. However, preparation work was carried out. Key health service professionals with access to potential participants were identified and initial communications instigated. These were typically augmentative and alternative communication specialists, and personnel within the paediatric and sub-acute brain injury services.

Attention has also been given to the methodology of data collection. Consequently, instead of the initial 10 participants, it is hoped that 12 people will be recruited, but in a two-stage process. Initially, 2 or 3 people will be ‘fast tracked’ before recruiting the remaining participants. This will provide a test bed for the data collection process and should therefore allow any difficulties in the methodology or practical issues of data recording to be addressed at an appropriately early stage. The methodology of data collection has been developed from the original research protocol as defined below:

  1. Test of intelligibility using the Frenchay assessment (Enderby, 1983), which is a development of the Yorkston and Beukelman assessment (1981).
  2. Random selection of the Kent words (Kent et al, 1989). The previous step ensures that the participants meet the eligibility criteria for the project, while this step provides detailed information regarding the ability of participants to communicate certain types of words.

Both of these steps will be performed in the initial assessment, while it is anticipated that further visits will include the Kent words and key words for the individual, for example ‘door open’.

In order to obtain accurate recordings for training the speech recogniser, it was felt that the minidisk recording envisaged in the research protocol could be improved upon. The minidisk recording will therefore be used only as a backup system. Instead, a software recorder will be developed to ensure that the quality of the recording is not compromised during the transfer of data to the computer. This approach also allows the same microphone and computer to be used during data gathering, training, and operation, which reduces the number of variables and may lead to a more robust system for the user. It also makes the user more familiar with the computer and microphone, which they ultimately will use during the training phase of the project.

3. Initial specification of demonstration applications.

By the end of November an initial specification of the demonstration applications was required with a formal specification to be completed by the end of January 2001 (Milestone 2). The original research protocol identified that applications were required in two areas, namely:

1. Voice output communication aids (VOCAs).

2. Environmental control system.

For VOCAs, one of the aims is to identify the most common phrases and sentences required by the intended users. Investigation has revealed that considerable research has been conducted into word frequency (see the British National Corpus word frequency lists for more information, http://info.ox.ac.uk/bnc/), but the lists are for the general population and are not targeted at communication aids. Communication with speech and language therapists has also so far failed to identify an appropriate list. At the end of the first three months efforts were ongoing, but it was believed that, owing to the unique requirements of each individual, there were few phrases that would be appropriate to all users.

A review of the literature on Environmental Control Systems (ECS) also failed to identify commonly used assistive technologies and words. Therefore, an initial list was produced with the clinical engineers at Barnsley District General Hospital and is provided in Appendix B. Again, it is envisaged that each user in the project may have unique commands.

Milestone Plan

During the initial three months only one milestone was indicated in the research plan, namely the completion of an initial literature review. As previously indicated, this milestone has been met.

Formal group meetings

During this initial set-up period attention has been given to the direction of the project team to ensure that everyone is working towards the same goals, and that these goals reflect the research protocol. Therefore, while there has been considerable communication and debate between individuals, formal group meetings have also played a prominent role in the project structure. Indeed, during this three-month period there have been four formal meetings, the agendas and minutes of which are provided in Appendix C. Professor Phil Green also gave a workshop to the team on speech recognition techniques.

Conclusions

The project team has been assembled, is meeting regularly, and is working well together. For the period reported, the project team has met the objectives defined in the original research protocol.

The initial three months has indicated that there has been little published research in several areas that this project addresses. There is therefore a significant opportunity, not only to be able to assist the individuals in this project, but also the research community. There have been few publications into the:

Through the course of this project it is believed that we will be able to contribute to these areas and therefore make a significant contribution to the development of services and therapy, not just for people with severe dysarthria, but perhaps also for people with other speech impairments. As a way of disseminating information and knowledge, a project web site has been created and will be updated when appropriate: www.shef.ac.uk/~pdg/stardust.

References

Enderby P. (1983). "The Frenchay dysarthria assessment." College-Hill Press. San Diego.

Kent RD. Weismer G. Kent JF. Rosenbek JC. (1989). "Toward phonetic intelligibility testing in dysarthria." Journal of Speech and Hearing Disorders. 54:482-99.

Yorkston KM. Beukelman DR. (1981). "Assessment of Intelligibility of Dysarthric Speech." Austin, TX: Pro-ed.

 

Statement of expenditure to 31st December 2000

 

| Item | Annual anticipated (£) | Actual to 31st Dec 2000 (£) | Note |
|---|---|---|---|
| Staff (inc. increments and wage inflation) | | | |
| Computer scientist | 21,490 | 4,842.86 | (1) |
| Speech therapist | 10,039 | 3,242.29 | (2) |
| Project co-ordinator | 6,239 | 1,540.85 | (3) |
| Equipment | | | |
| Minidisk recorder | 100 | - | (4) |
| Laptop computers * 6 | 5,700 | 1,311 | (5) |
| Speech recognition development system | 2,281 | 2,387.75 | (6) |
| Consumables (discs, paper etc.) | 500 | 168 | (7) |
| Travel | | | |
| Conferences | 600 | - | |
| Visiting or paying clients to attend | 700 | 77 | (8) |
| Overheads | | | |
| 40% of university staff cost | 12,612 | 3,233.66 | (9) |
| 1% of NHS staff costs | 62 | 15.41 | (10) |
| Total | 60,323 | 16,818.82 | |

Notes on Figures

  1. Programmer started 23 Oct, therefore 9/31 of Oct + Nov + Dec. Annual figure of £23,256 = (23,256/12) = 1,938 per month = 1,938*(9/31)+(1,938*2) = 4,439*1.09 (N.I.) = £4,842.
  2. Speech therapist started 9th October, therefore 23/31 of Oct + Nov + Dec. However, an invoice has been raised from the 1st September 2000, and the therapist has therefore agreed to make this time up at the end of the project. For the period 1/9/2000 to 31/3/01 the invoice is £5,674, giving a monthly figure of 5,674 / 7 = £811. For the period in question the total paid is therefore £811 * 4 = £3,242. The speech therapist employed on the project works 3wte, rather than 4wte as suggested in the research protocol. This is because the therapist is more qualified than anticipated and therefore, to stay within the budget, the hours have been reduced. However, this is not envisaged as a problem, as the therapist on the project is highly qualified and has a wealth of practical knowledge and extensive connections in the field.
  3. Co-ordinator started 4th Sept, therefore 27/30 of Sept + Oct + Nov + Dec. Annual figure of £20,261 = (20,261/12) = 1,688 per month = 1,688*(27/30)+(1,688*3) = 6,585*1.17 (17% overheads) = £7,704. However, they only work 1 day a week on this project, so 7,704 * (1/5) = £1,541.
  4. It was not necessary to purchase a minidisk recorder, as it was decided to develop our own software recorder and a minidisk recorder has been lent to the project for backup recording purposes.
  5. One laptop and carry case for clients has been purchased at a combined cost of £1,311 (1 * Toshiba Satellite Pro 4310 = £1,273 + carry case £38).
  6. One laptop, carry case and screen, at a combined purchase price of £2,388 (1 * Toshiba Satellite 2800-300 (inc. additional 128MB RAM and network card) £1,895, 1 * carry case £38, 1 * ViewSonic monitor £454.75).
  7. £168 for paper, discs and stationery.
  8. Travel expenses for speech therapist to 31st December.
  9. 40% of computer scientist and speech therapist salary.
  10. 1% of project co-ordinator salary cost for period = £15.
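
The pro-rata arithmetic in the notes can be checked with a short script. The following is a minimal sketch only, not part of the project accounts: the function name `pro_rata_cost` and its parameters are illustrative assumptions. It recomputes the project co-ordinator cost from the annual figure of £20,261 without intermediate rounding, and sums the actual expenditure column of the statement.

```python
from decimal import Decimal


def pro_rata_cost(annual, part_month_fraction, full_months,
                  on_cost_rate, share_of_week=1.0):
    """Cost of a partial first month plus whole months, with on-costs.

    Hypothetical helper for illustration; it mirrors the steps written
    out in the notes but does not round at intermediate stages.
    """
    monthly = annual / 12
    gross = monthly * (part_month_fraction + full_months)
    return gross * (1 + on_cost_rate) * share_of_week


# Project co-ordinator: started 4th Sept (27/30 of Sept), then Oct, Nov
# and Dec; 17% overheads; one day a week on this project.
coordinator = pro_rata_cost(20261, 27 / 30, 3, 0.17, share_of_week=1 / 5)

# Actual expenditure column of the statement, held as Decimal so that
# the pence add up exactly.
actuals = [Decimal(x) for x in (
    "4842.86", "3242.29", "1540.85", "1311", "2387.75",
    "168", "77", "3233.66", "15.41")]
total_actual = sum(actuals)

print(round(coordinator, 2))
print(total_actual)
```

Run as written, the co-ordinator figure comes to £1,540.85 and the column sums to £16,818.82, in agreement with the statement. Figures in the notes that are rounded at intermediate steps may differ from an unrounded calculation by a few pounds.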