Project A03 (finished)

Discourse Strategies across Social Media: Variability in Individuals, Groups, and Channels

PI(s): Prof. Dr. Tatjana Scheffler & Prof. Dr. Manfred Stede

Language in social media is characterised by more formal (written-like) or more informal (spoken-like) style in different contexts, and thus shows high variability. In this project, we focus on one linguistic domain in pragmatics, the management of common ground between writers and readers, and identify the consistent patterns of discourse strategies employed by writers across different groups and channels. We explore three types of phenomena that relate to common ground management: question tags, coreferential expressions, and coherence markers.

Question tags are particles attaching to a typically declarative clause to yield a kind of confirmation request. We conducted an extensive corpus study investigating the contexts and functions of different question tag variants in German. We found significant differences between the functions of individual tags and in the use of tags across conversational corpora (Twitter and spoken corpora), showing that only some uses of tags carry over from speech to written conversation. We are currently working on both computational and formal linguistic models that capture this variability.

Regarding coreferential relations, the research literature has yielded partly conflicting results, but it is generally accepted that their behavior differs between spoken and written language, for example in the length of referential chains and in the type of expression used (such as a pronoun or a full noun phrase). We extended this research to include social media conversations from Twitter, showing that coreferential relations on Twitter are more similar to spoken data than to written data. Subsequently, we adapted a computational model for automatic coreference resolution to better capture the idiosyncrasies of social media conversations.

Finally, we are investigating the realization of coherence relations in different social media. In existing corpus research, it is often unclear whether differences between corpora are due to confounds such as the topic of discourse, the authors/speakers included in the corpus, the language, the time of recording, etc. We address this by studying texts from two social media channels (Twitter and blogs) written by the same authors on similar topics. This allows us to separate the effect of individual medium constraints such as the mode (spoken vs. written) or the text type (narrative vs. interactive) from individual stylistic variation and topic effects, and to identify what stays stable with respect to coherence relation marking across all these dimensions.

Papers

Author(s) | Title | Year | Published in | Links
Aktaş, B., Scheffler, T., & Stede, M. | Anaphora Resolution for Twitter Conversations: An Exploratory Study. | 2018 | In M. Poesio, V. Ng, & M. Ogrodniczuk (Eds.), Proceedings of the First Workshop on Computational Models of Reference, Anaphora, and Coreference (pp. 1-10). New Orleans: Association for Computational Linguistics. * | DOI: 10.18653/v1/W18-0701
Das, D., Scheffler, T., Bourgonje, P., & Stede, M. | Constructing a Lexicon of English Discourse Connectives. | 2018 | In K. Komatani, D. Litman, K. Yu, A. Papangelis, L. Cavedon, & M. Nakano (Eds.), Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue (pp. 360-365). * |
Stede, M., Scheffler, T., & Mendes, A. | Connective-Lex: A Web-Based Multilingual Lexical Resource for Connectives. | 2019 | Discours. Revue de linguistique, psycholinguistique et informatique, 24, 3-38. | DOI: 10.4000/discours.10098
Aktaş, B., Scheffler, T., & Stede, M. | Coreference in English OntoNotes: Properties and Genre Differences. | 2019 | In K. Ekštein (Ed.), Text, Speech, and Dialogue: Proceedings of the 22nd International Conference on Text, Speech and Dialogue (TSD 2019) (pp. 171-184). Springer International Publishing. * | DOI: 10.1007/978-3-030-27947-9_15
Clausen, Y., & Nastase, V. | Metaphors in Text Simplification: To change or not to change, that is the question. | 2019 | In H. Yannakoudakis, E. Kochmar, C. Leacock, N. Madnani, I. Pilán, & T. Zesch (Eds.), Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 423–434). Florence: Association for Computational Linguistics. * |
Scheffler, T., Aktaş, B., Das, D., & Stede, M. | Annotating Shallow Discourse Relations in Twitter Conversations. | 2019 | In A. Zeldes, D. Das, E. M. Galani, J. D. Antonio, & M. Iruskieta (Eds.), Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019 (pp. 50-55). Minneapolis, MN: Association for Computational Linguistics. * |
Aktaş, B., & Stede, M. | Variation in Coreference Strategies across Genres and Production Media. | 2020 | In D. Scott, N. Bel, & C. Zong (Eds.), Proceedings of the 28th International Conference on Computational Linguistics (COLING) (pp. 5774-5785). Barcelona, Spain: International Committee on Computational Linguistics. * |
Aktaş, B., & Kohnert, A. | TwiConv: A Coreference-annotated Corpus of Twitter Conversations. | 2020 | In M. Ogrodniczuk, V. Ng, Y. Grishina, & S. Pradhan (Eds.), Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC@COLING) (pp. 47-54). Barcelona, Spain: Association for Computational Linguistics. * |
Aktaş, B., Solopova, V., Kohnert, A., & Stede, M. | Adapting Coreference Resolution to Twitter Conversations. | 2020 | In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 2454-2460). Association for Computational Linguistics. * |
Bevacqua, L., & Scheffler, T. | Form Variation of Pronominal It-Clefts in Written English. | 2020 | Linguistics Vanguard, 6(1), 20190066. | DOI: 10.1515/lingvan-2019-0066
Clausen, Y., & Scheffler, T. | Commitments in German Tag Questions: An Experimental Study. | 2020 | In S. Malamud, J. Pustejovsky, & J. Ginzburg (Eds.), Proceedings of the 24th Workshop on the Semantics and Pragmatics of Dialogue - Full Papers (SEMDIAL). * |
Clausen, Y., & Scheffler, T. | A corpus-based analysis of meaning variations in German tag questions. Evidence from spoken and written conversational corpora. | 2022 | Corpus Linguistics and Linguistic Theory, 18(1), 1-31. | DOI: 10.1515/cllt-2019-0060
Aktaş, B., & Stede, M. | Anaphoric Distance in Oral and Written Language: Experimental Evidence. | 2022 | Discours. Revue de linguistique, psycholinguistique et informatique, 31, 1-35. | DOI: 10.4000/discours.12383
Aktaş, B., Clausen, Y., Scheffler, T., & Stede, M. | Diskursstrategien in Sozialen Medien. | 2020 | In K. Marx, H. Lobin, & A. Schmidt (Eds.), Deutsch in Sozialen Medien. Interaktiv - multimodal - vielfältig (pp. 369–372). Boston, Berlin: de Gruyter. |
Clausen, Y. | You shall know a tag by the context it occurs in: An analysis of German tag questions and their responses in spontaneous conversations. | 2021 | In A. Holtz, I. Kovač, R. Puggaard-Rode, & J. Wall (Eds.), ConSOLE XXIX: Proceedings of the 29th Conference of the Student Organization of Linguistics in Europe (pp. 116-140). Leiden: Leiden University Centre for Linguistics. * |
Aktaş, B. | Variation in coreference patterns. | 2023 | PhD Thesis. Potsdam: Universitätsverlag Potsdam. | DOI: 10.25932/publishup-59608
Clausen, Y., & Stede, M. | Discourse connectives and their arguments: an experiment on anaphoricity in German. | 2022 | Linguistics Vanguard, 8(1), 95-111. | DOI: 10.1515/lingvan-2021-0102

Talks

Author(s) | Title | Year | Published in | Links
Aktaş, B., Scheffler, T., & Stede, M. | Anaphora Resolution for Twitter Conversations: An Exploratory Study. | 2018 | Paper presented at the CRAC: NAACL Workshop on Computational Models of Reference, Anaphora, and Coreference, New Orleans, LA, USA. 06 June. |
Clausen, Y., Scheffler, T., & Stede, M. | Variability of German Question Tags. | 2018 | Paper presented at the Discourse-Pragmatic Variation and Change (DiPVaC4), University of Helsinki, Helsinki, Finland. 28 - 30 May. |
Malamud, S. A., & Scheffler, T. | Propositions, updates, speech acts - what is involved in “won’t you?” question tags in American English. | 2018 | Paper presented at the 40. Jahrestagung der DGfS, Stuttgart. 07 - 09 March. |
Clausen, Y., & Scheffler, T. | Eine korpusbasierte Analyse von Bedeutungsvariation in deutschen Anhängsel-Fragen. | 2019 | Poster presented at the 55. Jahrestagung des Instituts für Deutsche Sprache (IDS), Mannheim, Germany. 12 - 14 March. |
Scheffler, T., Stede, M., Aktaş, B., & Clausen, Y. | Diskursvariabilität in sozialen Medien. | 2019 | Paper presented at the 55. Jahrestagung des Instituts für Deutsche Sprache (IDS), Mannheim, Germany. 13 March. |
Stede, M. | Granularity in coherence relations and in connective description: Empirical and practical considerations. | 2019 | Invited talk at the Fred Jelinek Seminar Series, Charles University, Prague, Czech Republic. 09 December. |
Stede, M. | From connectives to discourse relations - an analysis of CONTRAST. | 2019 | Invited talk at the Bucharest Discourse Workshop, University of Bucharest, Bucharest, Romania. 16 October. |
Stede, M. | Obwohl/Although: Moving beyond concession. | 2019 | Paper presented at the XPrag Workshop “Contrasting Underspecification and Overspecification of Discourse relations”, Leibniz-Zentrum für Allgemeine Sprachwissenschaft (ZAS), Berlin, Germany. 25 - 26 September. |
Aktaş, B., & Kohnert, A. | TwiConv: A Coreference-annotated Corpus of Twitter Conversations. | 2020 | Paper presented at the Third Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC@COLING). Online. 12 December. |
Aktaş, B., Solopova, V., Kohnert, A., & Stede, M. | Adapting Coreference Resolution to Twitter Conversations. | 2020 | Paper presented at the Workshop on Computational Approaches to Discourse (CODI@EMNLP). Online. 20 November. |
Aktaş, B., & Stede, M. | Variation in Coreference Strategies across Genres and Production Media. | 2020 | Paper presented at the 28th International Conference on Computational Linguistics (COLING). Online. 11 December. |
Clausen, Y. | Modelling the variability of German tag questions in discourse. | 2020 | Paper presented at the 16. Sprachwissenschaftliche Tagung für Promotionsstudierende (STaPs16), University of Vienna, Vienna, Austria. 25 - 26 September. |
Clausen, Y., & Scheffler, T. | Commitments in German Tag Questions: An Experimental Study. | 2020 | Paper presented at the Virtual SemDial 24 (WatchDial). 19 July. |
Scheffler, T. | Discourse level variability in social media. | 2020 | Invited talk at the Colloquium series “Mehrsprachigkeit, Sprachkontakt, Sprachvariation”, Humboldt-Universität zu Berlin, Berlin, Germany. 25 June. |
Scheffler, T. | Explicitness and implicitness of discourse relations across social media. | 2020 | Paper presented at the Workshop “Explicit and implicit coherence relations: Different, but how exactly?”, Humboldt-Universität zu Berlin, Berlin, Germany. 17 - 18 January. |
Clausen, Y. | German tag questions in discourse: limits and variability. | 2021 | Paper presented at the 29th Conference of the Student Organisation of Linguistics in Europe (ConSOLE 29), Centre for Linguistics, Leiden University, Leiden, Netherlands. 26 - 28 January. |
Stede, M. | Automatische Textgenerierung: Ein Blick auf die Technik. | 2021 | Invited talk at the Ringvorlesung: (Un)Creative Digital Writing, Technische Universität Dresden. 25 October. |
Stede, M. | Contrast in Discourse and in Argumentation. | 2021 | Invited talk at the Workshop: Integrating Perspectives on Discourse Annotation, Eberhard Karls Universität Tübingen. 04 - 05 October. |

Contact

University of Potsdam
Department Linguistics
Prof. Dr. Doreen Georgi
Karl-Liebknecht-Strasse 24-25
House 14, Room 3.33
14476 Potsdam

(+49) 331 977-2968
doreen.georgi@uni-potsdam.de