Konstantin Sering | Universität Tübingen

Coming from physics and psychology, my main focus is the production of natural human speech by the human vocal apparatus. I am fascinated that the vast majority of people learn within their first years of life to use spoken language as a means of communication, mostly efficiently and purposefully, across a wide variety of environments and situations. In doing so, the human speech organ displays remarkable coordination and dynamics. To me, the discriminative, information-carrying character of language is more central than its meaning-carrying one. As far as I understand the system of language so far, the "transfer of information" appears to be its essential feature. Concretely, this means that language can signal, for example, whether I find a situation pleasant or unpleasant, or whether I have understood something or not yet understood it. Such signalling always presupposes an appropriate context, and the same signal therefore naturally has different meanings in different contexts.

Methodologically, I currently approach spoken human language from the side of computational modelling and simulation. To this end, I am trying to extend VocalTractLab, a computer model of the human vocal apparatus developed by Peter Birkholz in Dresden, so that it can produce spontaneous human speech. To find the right trajectories of the control parameters that drive the vocal tract model, I combine simple artificial neural networks with a "predictive forward" approach that minimizes the error in several target spaces through situational planning. These target spaces include at least a semantic target space and an acoustic target space. A substantial part of my work resulted in the PAULE (Predictive Articulatory speech synthesis Utilizing Lexical Embeddings) model (doctoral thesis, Python code). Here I benefit from being able to draw on the resources and expertise of the Cognitive Modeling group around Martin V. Butz and on the linguistic and statistical expertise of our Quantitative Linguistics group around Harald Baayen.
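
The idea of planning control parameters by minimizing errors in several target spaces can be illustrated with a minimal sketch. It is not the PAULE implementation: the two small matrices `A` and `S` are hypothetical linear stand-ins for learned forward models, the functions `loss_and_grad` and `plan` are names invented for this example, and only a single time step is planned instead of a full trajectory.

```python
# Toy stand-ins for learned forward models (hypothetical values, not from PAULE).
A = [[0.8, 0.2], [-0.3, 1.0]]  # control parameters -> acoustic features
S = [[0.5, 0.5]]               # acoustic features -> semantic embedding


def matvec(m, v):
    """Matrix-vector product m @ v for nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def matvec_t(m, v):
    """Transposed product m.T @ v, used to propagate errors backwards."""
    return [sum(m[i][j] * v[i] for i in range(len(m))) for j in range(len(m[0]))]


def loss_and_grad(cp, acoustic_target, semantic_target):
    """Summed squared error in both target spaces and its gradient w.r.t. cp."""
    acoustic = matvec(A, cp)
    semantic = matvec(S, acoustic)
    err_ac = [p - t for p, t in zip(acoustic, acoustic_target)]
    err_sem = [p - t for p, t in zip(semantic, semantic_target)]
    loss = sum(e * e for e in err_ac) + sum(e * e for e in err_sem)
    # Chain rule: the semantic error flows back through S, then both errors through A.
    back_sem = matvec_t(S, [2.0 * e for e in err_sem])
    grad_ac = [2.0 * e + b for e, b in zip(err_ac, back_sem)]
    return loss, matvec_t(A, grad_ac)


def plan(cp, acoustic_target, semantic_target, steps=200, lr=0.1):
    """Gradient descent on the control parameters; the forward models stay fixed."""
    for _ in range(steps):
        _, grad = loss_and_grad(cp, acoustic_target, semantic_target)
        cp = [c - lr * g for c, g in zip(cp, grad)]
    return cp


# Targets are generated from known control parameters so that a solution exists.
true_cp = [1.0, 0.5]
acoustic_target = matvec(A, true_cp)
semantic_target = matvec(S, acoustic_target)
planned_cp = plan([0.0, 0.0], acoustic_target, semantic_target)
print(planned_cp)
```

In this toy setting the planned control parameters approach `true_cp`. The point of the sketch is only that the control parameters, not the model weights, are adjusted during planning.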

Publications

Konstantin Sering. Speech/non-speech classification slightly improves synthesis quality in PAULE. In Elektronische Sprachsignalverarbeitung 2024, pages 173–180, 2024.

Konstantin Sering. Predictive articulatory speech synthesis utilizing lexical embeddings (PAULE). PhD thesis, Universität Tübingen, 2023.

Konstantin Sering and Paul Schmidt-Barbo. Somatosensory feedback in PAULE. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2023, pages 119–126, 2023.

Karen V Beaman and Konstantin Sering. Measuring change in lectal coherence across real- and apparent-time. In The Coherence of Linguistic Communities, pages 87–105. Routledge, 2022.

Paul Schmidt-Barbo, Sebastian Otte, Martin V. Butz, R. Harald Baayen, and Konstantin Sering. Using semantic embeddings for initiating and planning articulatory speech synthesis. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2022, pages 32–42, 2022.

Konstantin Sering and Paul Schmidt-Barbo. Articubench - an articulatory speech synthesis benchmark. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2022, pages 43–50, 2022.

Karen V Beaman, Fabian Tomaschek, and Konstantin Sering. The cognitive coherence of sociolects across the lifespan: A case study of Swabian German. 2021.

Jakob Fink-Lamotte, Andreas Widmann, Konstantin Sering, Erich Schröger, and Cornelia Exner. Attentional processing of disgust and fear and its relationship with contamination-based obsessive–compulsive symptoms: Stronger response urgency to disgusting stimuli in disgust-prone individuals. Frontiers in Psychiatry, 12, 2021.

Paul Schmidt-Barbo, Elnaz Shafaei-Bajestan, and Konstantin Sering. Predictive articulatory speech synthesis with semantic discrimination. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2021, pages 177–184, 2021.

Konstantin Sering, Fabian Tomaschek, and Motoki Saito. Anticipatory coarticulation in predictive articulatory speech modeling. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2021, pages 208–215, 2021.

Fabian Tomaschek, Denis Arnold, Konstantin Sering, and Friedolin Strauss. A corpus of schlieren photography of speech production: potential methodology to study aerodynamics of labial, nasal and vocalic processes. Language Resources and Evaluation, 55(4):1127–1140, 2021.

Konstantin Sering, Paul Schmidt-Barbo, Sebastian Otte, Martin V Butz, and Harald Baayen. Recurrent gradient-based motor inference for speech resynthesis with a vocal tract simulator. In 12th International Seminar on Speech Production, 2020.

Konstantin Sering and Fabian Tomaschek. Comparing KEC recordings with re-synthesized EMA data. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2020, pages 77–84, 2020.

Fabian Tomaschek, Denis Arnold, Konstantin Sering, Benjamin V Tucker, Jacoline van Rij, and Michael Ramscar. Articulatory variability is reduced by repetition and predictability. Language and Speech, 2020.

Konstantin Sering, Niels Stehwien, Yingming Gao, Martin V Butz, and Harald Baayen. Resynthesizing the GECO speech corpus with VocalTractLab. Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2019, pages 95–102, 2019.

Konstantin Sering, Petar Milin, and Harald Baayen. Language comprehension as a multi-label classification problem. Statistica Neerlandica, 2018.

Denis Arnold, Fabian Tomaschek, Konstantin Sering, Florence Lopez, and Harald Baayen. Words from spontaneous conversational speech can be recognized with human-like accuracy by an error-driven learning algorithm that discriminates between meanings straight from smart acoustic features, bypassing the phoneme as recognition unit. PLoS ONE, 2017.

Konstantin Sering. Dispersion Forces – Numerical Methods for Casimir-Polder Potentials in Complex Geometries. Diploma thesis, 2014.

Konstantin Sering. Gaze Coherence – Improving and evaluating a spatio-temporal normalised scan path saliency approach. Diploma thesis, 2013.

Florian Wickelmaier, Nora Umbach, Konstantin Sering, and Sylvain Choisel. Comparing three methods for sound quality evaluation with respect to speed and accuracy. Convention Paper 7783 of the Audio Engineering Society, 2009.

Gudrun Tisch, Hedwig Seelentag, Leon Sering, Thomas Hinke, Konstantin Sering, Katharina Mölter, Phillip Urbanik, and Ole Schmidt. Fümo - Das Buch. Tenea, 2007.

Software

articubench (Author and Maintainer), Python package: An articulatory speech synthesis benchmark that combines publicly available data with our own electromagnetic articulography and ultrasound tongue-movement measurements to compare different articulatory speech synthesis control models, https://github.com/quantling/articubench, since 2022.

paule (Author and Maintainer), Python package: Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE), a control model for the VocalTractLab speech synthesizer, https://github.com/quantling/paule, since 2021.

create_vtl_corpus (Co-Author and Maintainer), Python scripts to create and synthesize a speech corpus with VocalTractLab, https://github.com/quantling/create_vtl_corpus, since 2019.

pyndl (Co-Author and Maintainer), Python package that re-implements learning and classification models based on the Rescorla-Wagner equations, https://github.com/quantling/pyndl, since 2016.

ndl2 (Maintainer), R package that implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations. Email me for a copy. Since 2016.

ndl (Maintainer), R package that implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations, https://cran.r-project.org/web/packages/ndl/index.html, since 2016.

synchronicity (Author), Python package to calculate gaze synchronicity and coherence values for a group of viewers of a dynamic scene out of raw eye tracking data, https://github.com/derNarr/synchronicity, 2015.

segmag (Contributor), R package to determine event boundaries in event segmentation experiments, http://cran.r-project.org/web/packages/segmag/index.html, 2016.

achrolab (Co-Author), Python package to control and calibrate hardware in an achromatic color laboratory, https://github.com/derNarr/achrolab, 2012–2013.
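
The Rescorla-Wagner learning rule that the pyndl, ndl2 and ndl entries above refer to can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (one shared learning rate `eta`, maximum association strength `lam`), and the function `rescorla_wagner` and its event format are hypothetical: it does not reproduce the API or the optimizations of any of these packages.

```python
from collections import defaultdict


def rescorla_wagner(events, outcomes, eta=0.1, lam=1.0):
    """Learn cue-outcome association strengths from a sequence of events.

    events   -- iterable of (cues, present_outcomes) pairs, both sets of strings
    outcomes -- all outcomes that can occur
    eta      -- learning rate (assumed identical for all cues and outcomes)
    lam      -- maximum association strength
    """
    weights = defaultdict(float)  # (cue, outcome) -> association strength
    for cues, present in events:
        for outcome in outcomes:
            # Summed support of all present cues for this outcome.
            activation = sum(weights[(cue, outcome)] for cue in cues)
            target = lam if outcome in present else 0.0
            delta = eta * (target - activation)
            # Every present cue is adjusted by the same prediction error.
            for cue in cues:
                weights[(cue, outcome)] += delta
    return weights


# Toy data: the cue "a" always predicts "x", while "y" never occurs.
events = [({"a"}, {"x"})] * 100
weights = rescorla_wagner(events, outcomes={"x", "y"})
print(weights[("a", "x")], weights[("a", "y")])
```

With this toy data the strength from "a" to "x" grows towards `lam`, while the strength from "a" to "y" stays at zero.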

Presentations

Konstantin Sering. Predictive articulatory speech synthesis utilizing lexical embeddings (PAULE), 2021. Spoken Morphology Colloquium, Düsseldorf, Germany.

Konstantin Sering. Learning vocal tract control parameters to synthesize speech, 2019. MoProc Workshop, Tübingen, Germany.

Konstantin Sering. Learning vocal tract control parameters to synthesize speech, 2019. Neural Information Processing Group, Tübingen, Germany.

Ingmar Steiner, Fabian Tomaschek, Timo Bolkart, Alexander Hewer, Stefanie Wuhrer, and Konstantin Sering. Head and tongue model: Simultaneous dynamic 3d face scanning and articulography, 2018. Simphon.net Meeting, Stuttgart, Germany.

Konstantin Sering. Mimicking to speak with a vocal tract model -- first ideas. Poster, MLSS, 2017.

Denis Arnold, Florence Lopez, Konstantin Sering, Fabian Tomaschek, and Harald Baayen. Acoustic speech learning without phonemes: Identifying words isolated from spontaneous speech as a validation for a discriminative learning model for acoustic speech learning. Talk, TeaP, 2016.

Jakob Fink, Andreas Widmann, Konstantin Sering, and Cornelia Exner. Attentional bias triggers disgust-specific habituation problems in subclinical contamination based obsessive-compulsive disorder. Poster, TeaP, 2016.

Konstantin Sering, Nora Umbach, and Dominik Wabersich. Achrolab – using non-python supported device for a vision lab. Poster, Euro SciPy in Paris, 2011.

Teaching

WS 2023

Introduction to Psycholinguistics

Introductory lecture, 4 semester hours per week, plus a 2-hour tutorial. Taught together with Motoki Saito.

SS 2023

Physics of human speech production (PHSP)

Advanced seminar (Hauptseminar), 2 semester hours per week.

SS 2023

Spoken Word Recognition

Advanced seminar (Hauptseminar), 2 semester hours per week.

WS 2022

Seminar on Predictive Articulatory Speech Synthesis

Advanced seminar (Hauptseminar), 2 semester hours per week.

WS 2020

PhD Seminar Semantic Embeddings

Irregular meetings with other doctoral students in which we discussed various forms of semantic embeddings.

SS 2018

Mathematical Methods: Statistics

See the description of Mathematics for Linguists: Statistics under WS 2015.

The course was taught mainly by Elnaz Shafaei, in English, with my support.

SS 2017

Mathematical Methods: Statistics

See the description of Mathematics for Linguists: Statistics under WS 2015.

WS 2016

Mathematics for Linguists: Statistics

See the description of Mathematics for Linguists: Statistics under WS 2015.

SS 2016

Regression Modeling Strategies for the Analysis of Linguistic and Psycholinguistic Data

Regression Modeling Strategies is an elective course of three semester hours per week for students of computational linguistics and general linguistics. It covers advanced topics in statistical inference, in particular (generalized) linear mixed-effects models (gLMER), generalized additive models (GAMMs), and survival analysis.

The course is taught together with Harald Baayen, in English.

WS 2015

Mathematics for Linguists: Statistics

Mathematics for Linguists: Statistics is a compulsory course for students of computational linguistics and general linguistics, with two semester hours of lecture per week and two hours of exercises. It covers the foundations of set theory, probability theory, random variables, distributions, information theory, and statistical modeling.

The course is taught in English.