Hierarchical-Task Reservoir for Online Semantic Analysis From Continuous Speech

Luca Pedrelli; Xavier Hinaut

doi:10.1109/TNNLS.2021.3095140

IEEE Trans Neural Netw Learn Syst. 2022 Jun;33(6):2654-2663. doi: 10.1109/TNNLS.2021.3095140. Epub 2022 Jun 1.

PMID: 34570710

Abstract

In this article, we propose a novel architecture called hierarchical-task reservoir (HTR), suitable for real-time applications for which different levels of abstraction are available. We apply it to semantic role labeling (SRL) based on continuous speech recognition. Taking inspiration from the brain, which exhibits hierarchies of representations from perceptive to integrative areas, we consider a hierarchy of four subtasks with increasing levels of abstraction: phone, word, part-of-speech (POS), and semantic role tags. These tasks are progressively learned by the layers of the HTR architecture. Quantitative and qualitative results show that the hierarchical-task approach improves prediction. In particular, the qualitative results show that a shallow reservoir and a hierarchical reservoir, considered as baselines, do not produce estimations as good as those of the HTR model. Moreover, we show that the accuracy of the model can be further improved by designing skip connections and by considering word embedding (WE) in the internal representations. Overall, the HTR outperformed the other state-of-the-art reservoir-based approaches and proved extremely efficient compared with typical recurrent neural networks (RNNs) in deep learning (DL) [e.g., long short-term memory (LSTM) networks]. The HTR architecture is proposed as a step toward the modeling of online and hierarchical processes at work in the brain during language comprehension.
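
The layered scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each layer is a leaky echo state reservoir with a ridge-regression readout trained on one subtask, and that each layer receives the previous layer's readout output as its input. The function names, reservoir size, leak rate, spectral radius, ridge coefficient, and the random toy data are all hypothetical, and skip connections and word embeddings are left out.

```python
import numpy as np

def make_reservoir(n_in, n_units, rng, spectral_radius=0.9):
    # Random input and recurrent weights (hypothetical scaling choices).
    W_in = rng.uniform(-1.0, 1.0, size=(n_units, n_in))
    W = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs, leak=0.3):
    # Leaky-integrator state update, one step per input frame (online).
    states = np.zeros((inputs.shape[0], W.shape[0]))
    x = np.zeros(W.shape[0])
    for t, u in enumerate(inputs):
        x = (1.0 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

def add_bias(states):
    # Append a constant column so the readout can learn an offset.
    return np.hstack([states, np.ones((states.shape[0], 1))])

def fit_readout(states, targets, ridge=1e-4):
    # Ridge regression from reservoir states to the subtask targets.
    X = add_bias(states)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def train_htr(inputs, targets_per_layer, n_units=50, seed=0):
    # Train the layers one after another; each layer's readout output
    # becomes the input of the next layer (assumed wiring).
    rng = np.random.default_rng(seed)
    layers, outputs, layer_input = [], [], inputs
    for targets in targets_per_layer:
        W_in, W = make_reservoir(layer_input.shape[1], n_units, rng)
        states = run_reservoir(W_in, W, layer_input)
        W_out = fit_readout(states, targets)
        layer_input = add_bias(states) @ W_out
        layers.append((W_in, W, W_out))
        outputs.append(layer_input)
    return layers, outputs

# Toy data only: random "acoustic" frames and random one-hot labels for
# four subtasks standing in for phone, word, POS, and semantic role tags.
rng = np.random.default_rng(1)
T = 200
features = rng.normal(size=(T, 13))
label_sizes = [5, 4, 3, 2]
targets = [np.eye(k)[rng.integers(0, k, size=T)] for k in label_sizes]
layers, outputs = train_htr(features, targets)
```

Because the toy labels are random, the outputs carry no meaning; the sketch only shows how subtasks of increasing abstraction can be stacked so that each layer is trained on its own target.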

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Brain
Neural Networks, Computer
Semantics*
Speech*