The Center Project

In search of attractive AI sounds: An acoustics and cognitive science study

  • Participating professor: Ji-Eun Song
  • Project duration: Nov 1, 2022 ~ Oct 31, 2023 (12 months) – Year 2
  • Project objective: Quantify listener preference for synthetic speech, identify the acoustic factors that affect it, and establish a foundation for objective quality assessment of synthetic speech
    • Measure preference for, and the listening effort required by, synthetic speech used in AI sound services through cognitive experiments

    • Statistically identify the acoustic and sound factors that influence preference for, and the listening effort required by, synthetic speech

    • Identify acoustic factors that affect the assessment of preference for synthetic speech
      • Year 1: Focus on natural speech
      • Year 2: Extend to "synthetic speech"

    • Build an objective assessment method

      Overcome the shortcomings of the MOS (Mean Opinion Score) method

    • Build a foundation to develop synthetic speech
    • Relationship between acoustic characteristics and listening effort

      Minimize fatigue when listening to synthetic speech

  • Project execution: Methods for measuring preference for and listening effort with synthetic speech, and for extracting acoustic factors
    • Extract synthetic speech from KT's AI voice services (e.g., AI call assistant), then conduct acoustic analysis (i.e., voice characteristics, articulation speed, vowel dispersion) and cognitive experiments.

    • Extend the acoustic analysis and cognitive experiment approach developed in Year 1 to synthetic speech.

    • Use statistical analysis (such as Principal Component Analysis) to identify the acoustic factors of synthetic speech that affect listener preference or fatigue, objectively assess the quality of synthetic speech, and establish standards.

    • Work closely with KT’s voice synthesis team on acoustics and cognitive science research that can help improve the quality of synthetic speech.
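
The statistical step in the execution plan (Principal Component Analysis over acoustic measurements, then relating the result to listener preference) could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual pipeline: the feature values, the preference ratings, and all function names are made up for the example, and the first principal component is found with plain power iteration instead of a statistics package.

```python
import math

# Hypothetical, made-up measurements for six synthetic voices (illustration only).
# Columns: mean pitch (Hz), articulation rate (syllables/s), vowel dispersion (Hz).
FEATURES = [
    [210.0, 5.1, 420.0],
    [185.0, 4.6, 390.0],
    [230.0, 5.8, 455.0],
    [175.0, 4.2, 360.0],
    [200.0, 5.0, 410.0],
    [220.0, 5.5, 440.0],
]
# Hypothetical mean preference rating per voice on a 1-5 scale.
PREFERENCE = [3.9, 3.2, 4.4, 2.8, 3.6, 4.1]


def standardize(rows):
    """Z-score each column so features on different scales are comparable."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance_matrix(z):
    """Sample covariance matrix of columns that are already centered."""
    n, d = len(z), len(z[0])
    return [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(cov, iterations=200):
    """Power iteration: return (unit loading vector, eigenvalue) for PC1."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


def project(z, loadings):
    """Score each voice on the component defined by the loading vector."""
    return [sum(a * b for a, b in zip(row, loadings)) for row in z]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


z = standardize(FEATURES)
cov = covariance_matrix(z)
loadings, eigenvalue = first_principal_component(cov)
scores = project(z, loadings)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

print("PC1 loadings:", [round(x, 3) for x in loadings])
print("Variance explained by PC1:", round(explained, 3))
print("Correlation of PC1 scores with preference:",
      round(pearson(scores, PREFERENCE), 3))
```

In a real study the feature table would come from the acoustic analysis of the recorded synthetic speech, the ratings from the cognitive experiments, and the simple correlation would likely be replaced by a model that accounts for listener and item variability.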