Tsinghua University – University of Amsterdam Joint Research Centre for Logic

Events

[Advances in Logic and Artificial Intelligence] 29th May, 2026:

Speaker: 郑淳元 Chunyuan Zheng (Peking University)

Time: 9:50-12:15, 29 May 2026

Location: Room 3104, Third Teaching Building, Tsinghua University

Abstract:

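Large language models are profoundly changing how knowledge is acquired, how information is produced, and how complex decisions are supported, bringing unprecedented opportunities for industrial upgrading and the intelligent development of society. However, the prevailing pre-training paradigm minimizes next-token prediction loss and therefore essentially learns statistical co-occurrence relations in the data. Although such models perform well on tasks such as knowledge question answering and content generation, when facing complex real-world problems they still commonly suffer from hallucination, logical inconsistency, bias propagation, and insufficient safety, which severely restricts their reliable deployment and wide application.
Meanwhile, recent advances in the field suggest that, to reach a higher level of intelligence, large models must strengthen their ability to model causal relations explicitly. In this talk, I will focus on three areas of research: first, the basic ideas and core problems of causality; second, how causal and logical methods can empower large language models; and finally, how large language models can in turn empower research on causality and logic. By exploring this two-way empowerment among causality, logic, and large language models, the talk aims to provide new theoretical foundations and methodological support for building more reliable artificial intelligence.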

=====

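Speaker Bio: Chunyuan Zheng (郑淳元) is a PhD student at the School of Mathematical Sciences, Peking University, with research interests in trustworthy machine learning, causal inference, and large language model reasoning. Zheng has published multiple papers at CCF-A conferences, including the three leading AI conferences ICML, NeurIPS, and ICLR. Service roles include Area Chair for NeurIPS; program committee member for top conferences including ICML, ICLR, KDD, and WWW; co-chair of the AAAI 2025 workshop on causality; and student assistant for the ICLR 2026 workshop on LLM logical reasoning. Homepage: https://chunyuanzheng.github.io/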

[Advances in Logic and Artificial Intelligence] 15th May, 2026:

Speaker: 成凤祥 Fengxiang Cheng (University of Amsterdam)

Time: 9:50-12:15, 15 May 2026

Location: Room 3104, Third Teaching Building, Tsinghua University

Abstract:

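Although large language models perform well across many tasks, their logical reasoning in complex scenarios remains markedly deficient. In particular, they struggle to give consistently accurate answers on concrete reasoning tasks and to keep different answers logically consistent with one another. In real-world settings that require rigorous reasoning and reliable decisions, this limitation significantly restricts the practical value of large language models. In this talk, we focus on methods for evaluating and improving the complex logical reasoning abilities of large language models. Specifically, noting that different symbolic languages have different strengths and that symbolic and natural-language reasoning are complementary, we introduce MAD-Logic, a multi-agent debate framework that improves the symbolic translation and reasoning abilities of large models. In addition, observing that existing methods are confined to a closed world in which all premises are given explicitly, and cannot handle open-world problems with incomplete information and missing commonsense knowledge, we introduce OpenIKLR, an intelligent symbolic reasoning algorithm that actively identifies and fills in missing premises. Finally, logical consistency requires that a model's answers to different questions do not contradict one another and conform to the rules of logical inference. We introduce LogiConBench, the first benchmark for evaluating the complex logical consistency of large models, which effectively addresses the lack of scalability, diversity, and difficulty in existing logical reasoning benchmarks.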

=====

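Speaker Bio: Fengxiang Cheng (成凤祥) is a second-year PhD student at the Institute for Logic, Language and Computation (ILLC), University of Amsterdam, and holds a master's degree in logic from Tsinghua University. Cheng's main research interests are logical reasoning in large models, causal inference, and natural language processing. Cheng has published multiple first-author papers in top conferences and journals including IJCAI, AAAI, ACL, EMNLP, and Topoi, has given tutorials at IJCAI and AAAI, and has co-organized workshops on LLM logical reasoning at ICLR and AAAI. Homepage: https://fengxiang-cheng.github.io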

[Advances in Logic and Artificial Intelligence] 8th May, 2026:

Speaker: Mitra Baratchi (University of Leiden)

Time: 9:50-12:15, 8 May 2026

Location: Room 3104, Third Teaching Building, Tsinghua University

Abstract:

Modern sensing technologies have provided the possibility of sensing the world in a way that has not been possible before, generating massive spatio-temporal data sources. How can we use such data to understand and even change the complex world around us for the better? In this talk, I will discuss unique machine learning challenges in transforming such data into actionable decisions. These challenges call for automated solutions to address various problems, from filling the gaps in the data to filling the gaps in the knowledge acquired from data alone. I will present a few examples of such problems and automated solutions to address them.

=====

Speaker Bio: Mitra Baratchi leads the Spatio-temporal data Analysis and Reasoning (STAR) research group and is a member of the interdisciplinary research programme Society, Artificial Intelligence and Life Sciences (SAILS). Her research interest lies in spatio-temporal, time-series, and mobility data modelling. Specifically, she designs algorithms that extract patterns from such data in a fully automated manner. Her research targets applications in a broad range of urban, environmental, and industrial domains for which she has collaborations notably with the European Space Agency, Honda Research Institute, various municipalities, and researchers in other scientific disciplines.

[Advances in Logic and Artificial Intelligence] 17th April, 2026:

Speaker: Jundong Xu (National University of Singapore)

Time: 9:50-12:15, 17 April 2026

Location: Room 3104, Third Teaching Building, Tsinghua University

Abstract:

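Although large language models exhibit strong reasoning abilities, their logical reasoning remains notably unreliable: it is sensitive to prompts, easily distracted, and lacks consistency, which limits their use in high-reliability settings.
Centered on the core question of how to achieve reliable logical reasoning, this talk presents the evolution of a series of neuro-symbolic methods: from SymbCoT, which integrates symbolic reasoning into the language generation process, to Aristotle, which introduces a backtrackable multi-path reasoning framework, to LogicReward, which aligns the model with logical correctness through a learning mechanism. Each step improves the reliability of model reasoning. Finally, we extend this paradigm to multimodal settings and present the reasoning ability of MuSLR on complex inputs.
Overall, these works show a unified path from prompt-level methods to structured reasoning to learning-driven optimization, offering new ideas for building reliable large-model reasoning systems.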

=====

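Speaker Bio: Jundong Xu (徐俊东, https://aiden0526.github.io/) is a first-year PhD student in the Department of Computer Science at the National University of Singapore. His main research interest is the reasoning ability of large models, including rigorous logical reasoning and symbolic reasoning. He works on making the reasoning process of large models more trustworthy and verifiable. His first-author work in the neuro-symbolic area has been published at top AI conferences including NeurIPS, ICLR, ACL, and AAAI, and received the AAAI 2026 Symbolic and Logical Reasoning Workshop Best Paper Award.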

[Advances in Logic and Artificial Intelligence] 26th March, 2026:

Resolution Chain-of-Thought for LLM Symbolic Reasoning

Speaker: Yixiang Chen (East China Normal University)

Time: 16:00-17:30, 26 March 2026

Abstract:

Large language models still struggle with complex logical reasoning. Numerous studies have explored ways to strengthen their inference skills, broadly grouped into solver-based, prompt-based, and fine-tuning approaches. Among these, prompting techniques improve LLMs by explicitly modeling reasoning chains, as in Chain-of-Thought (CoT) and Tree-of-Thought (ToT); by acquiring symbolic expressions, as in SymbCoT; and by adaptively selecting Symbolic Languages (SL).
Building on this line of work, we introduce bidirectional reasoning and implement an automated reasoning process that is driven by LLM generation through prompt design. Technically, our method, Bi-Resolution, first converts the natural language problem into a first-order logic formulation and selects the corresponding version of the resolution algorithm. During resolution, bidirectional reasoning guides constraint instantiation to prune redundant clauses and reduce complexity. Bi-Resolution enables the model to judge statements that are “neither fully true nor fully false” more accurately. Experimental results show that our method improves the logical inference accuracy of large language models.
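
As a rough illustration of the resolution step mentioned above, and not the speaker's Bi-Resolution method, the following minimal Python sketch performs plain propositional resolution refutation. The clause encoding (sets of string literals, with "~" marking negation), the function names `resolve` and `entails`, and the rain/wet example are all assumptions made here for illustration. The method described in the abstract works over first-order logic with LLM-guided instantiation, which this sketch does not attempt.

```python
from itertools import combinations


def negate(lit):
    """Return the complementary literal: 'p' <-> '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        comp = negate(lit)
        if comp in c2:
            # Drop the complementary pair and merge what remains.
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {comp})))
    return resolvents


def entails(premises, goal):
    """Resolution refutation for a single-literal goal.

    The premises entail the goal iff the premises together with the
    negated goal derive the empty clause.
    """
    clauses = set(premises) | {frozenset([negate(goal)])}
    while True:
        new = set()
        for a, b in combinations(clauses, 2):
            for r in resolve(a, b):
                if not r:  # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= clauses:  # saturated without contradiction
            return False
        clauses |= new


# Hypothetical example: "rain implies wet" (as the clause ~rain OR wet) and "rain".
premises = [frozenset({"~rain", "wet"}), frozenset({"rain"})]
print(entails(premises, "wet"))
print(entails(premises, "sunny"))
```

The sketch saturates the clause set exhaustively, which is exactly the cost that pruning redundant clauses is meant to reduce.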

=====

Speaker Bio: Yixiang Chen is a professor at the School of Software Engineering, East China Normal University. He currently serves as the first Chair of the Artificial Intelligence Logic Committee of the Chinese Association for Artificial Intelligence and the first Chair of the Trusted Intelligent Systems Committee of the Shanghai Association for Artificial Intelligence. He is engaged in foundational and engineering research on the trustworthiness of artificial intelligence. He has established the spatio-temporally consistent intelligent system specification language STeC and its hybrid clock logic system, designed technical methods for the optimized hardware and software design of intelligent systems, and developed multidimensional attribute-based software trustworthiness measurement, evaluation methods, and enhancement specifications.

[Advances in Logic and Artificial Intelligence, lectures] 27th February, 6th March, 13th March, 2026:

Probabilistic Causal Models

Speaker: Hanti Lin

Time: 9:50-12:15, 27 February 2026

Algorithms for Causal Learning

Speaker: Hanti Lin

Time: 9:50-12:15, 6 March 2026

Probabilistic Causal Models

Speaker: Hanti Lin

Time: 9:50-12:15, 13 March 2026

=====

Speaker Bio: Hanti Lin is a philosopher of science and formal epistemologist, with papers published in philosophy as well as theoretical computer science. Before he joined UC Davis, he was a postdoc at the Australian National University.

[Advances in Logic and Artificial Intelligence] 18th September, 2025:

Towards Logical and Causal Reasoning of Large Language Models

Speaker: Haoxuan Li (Peking University)

Time: 16:00-17:30, 18 September 2025

Abstract:

Large language models (LLMs) have achieved remarkable success in various natural language tasks, but their logical and causal reasoning abilities remain significantly limited. In this talk, we first give a comprehensive introduction to cutting-edge LLM logical reasoning approaches, organized under a newly proposed taxonomy. Specifically, methods for accurately answering complex logic questions can be categorized by their reliance on external solvers, prompts, or fine-tuning. To avoid logical contradictions, we discuss concepts and solutions for various kinds of logical consistency, including implication, negation, transitivity, and factuality consistency, and their composites. Second, we discuss the benefits of introducing causality into LLM reasoning, where the key insight is that correlation does not necessarily imply causation. For example, both ice cream sales and crime rates are high in summer, but this does not indicate that ice cream sales have a causal influence on crime rates. We conclude that logical rules can be regarded as the causal invariance of LLM reasoning based on natural language examples.

=====

Speaker Bio: Haoxuan Li is an assistant researcher at Peking University and a research fellow at the Tsinghua-UvA Joint Research Center for Logic and the University of Oxford. He graduated from the experimental class for gifted children at Beijing No.8 Middle School, which enabled him to pursue his PhD at the age of 19. His research interests include causal inference and logical reasoning of large language models. He has more than 50 publications as first author or corresponding author in top-tier CCF-A conferences, which have been reported by MIT Technology Review and CAAI. Moreover, he is supported by the Young Scientists Fund of the National Natural Science Foundation of China (¥300,000) and the Young Elite Scientists Sponsorship Program by CAST – Doctoral Student Special Plan (via CCF). He was selected as the 2024 Peking University Person of the Year and as a National Scholarship representative reported by People’s Daily.

[TALK] 18th May, 2025:

Developing And Assessing Language Models For Logical Reasoning Over Natural Language

Speaker: Qiming Bao (University of Auckland)

Time: 10:00 AM, 18 May 2025

Abstract: Recent advancements in AI have highlighted the importance of integrating deep learning with symbolic logic reasoning. Language models such as RoBERTa, DeBERTa, LLaMA, Alpaca, Vicuna, GPT-3.5, and GPT-4 have advanced the performance of AI systems in various natural language processing tasks to human-like levels. However, the generalization of language models in logical reasoning remains underexplored. One of the main reasons is the limitation posed by the lack of extensive, balanced, and real-world datasets for logical reasoning. This presentation has three research objectives, addressing the main research gap/limitation:
1. To improve the models’ out-of-distribution performance on multi-step logical reasoning tasks through logic-driven data augmentation.
2. To enhance the models’ performance on real-world logical reasoning datasets by constructing an Abstract Meaning Representation based logic-driven data augmentation method.
3. To examine whether large language models, which demonstrate impressive performance on current logical reasoning leaderboards, truly possess strong capabilities in logical reasoning, a question that remains underexplored.
The first part of the presentation focuses on improving language models’ ability in multi-step logical reasoning, particularly when faced with unbalanced reasoning steps. Inspired by DeepLogic, we present IMA-GloVe-GA, an RNN-based model with a gate attention mechanism, developed to accommodate varying reasoning depths. This is facilitated by our PARARULE-Plus dataset, created for deeper reasoning tasks. Our results show notable enhancements in model performance under both standard and out-of-distribution conditions.
The second part of the presentation focuses on generating diverse training data to address the scarcity of real-world logical reasoning datasets and enhance large language models (LLMs) for logical reasoning tasks. We introduce AMR-LDA, a data augmentation method that converts text into Abstract Meaning Representation (AMR) graphs, improving reasoning datasets. This approach benefits various models, including GPT-3.5 and GPT-4, and improves performance, notably achieving the top rank on the ReClor leaderboard.
The third part of the presentation examines how Large Language Models (LLMs) like GPT-3.5 and GPT-4 respond to trivial changes in logical reasoning datasets. We created ReClor-plus, LogiQA-plus, and LogiQAv2-plus, which include shuffled options and modified correct choices to test LLMs’ logical reasoning. Although LLMs excel on standard datasets, they exhibit degraded performance with these modified versions. Our findings reveal that incorporating task variations, perturbations in training sets, and logic-driven data augmentation significantly enhances LLMs’ generalisation and robustness in logical reasoning scenarios.
This presentation explores several different approaches to demonstrate a more robust QA system that aids computers in thinking and reasoning over natural language texts through logical reasoning. Our methods have been evaluated and now lead the public logical reasoning leaderboard, ReClor. We are the first group in the world to have scored above 90% on the ReClor hidden test set.

About the speaker: Qiming Bao is a Ph.D. graduate of the Strong AI Lab, NAOInstitute, University of Auckland, New Zealand, where he was supervised by Professor Michael Witbrock and Associate Professor Jiamou Liu. His research interests include natural language processing and reasoning. He has over five years of research and development experience, and has published several papers in top conferences in the fields of AI/NLP/Reasoning, including ACL, AAAI, IJCAI, ICLR, EACL, LLM@IJCAI, AGI@ICLR and IJCLR-NeSy. His method AMR-LDA (GPT-4 + AMR-LDA Prompt Augmentation) has achieved the #1 ranking on ReClor, one of the most challenging logical reasoning reading comprehension leaderboards, and his group was the first in the world to score above 90% on the hidden test set. Two of his logical reasoning datasets, PARARULE-Plus and AbductionRules, have been collected by LogiTorch, ReasoningNLP, Prompt4ReasoningPapers, OpenAI/Evals, A Survey on Evaluation of Large Language Models, and Reasoning Language Models: A Blueprint. Qiming has given public guest talks and made academic visits at Microsoft Research Asia, Samsung AI Center Cambridge UK, IEEE Vehicular Technology Society, ZJU-NLP Group, Zhejiang University, The University of Melbourne, Institute of Automation, Chinese Academy of Sciences, Shenzhen MSU-BIT University, University of Massachusetts – Amherst and Penn State University on his main research topic, “Natural Language Processing and Reasoning”.