Tomasz Korbak's research works

UK AI Safety Institute - Cited by 1666 - language models - AI safety - reinforcement learning - Bayesian inference - LLM agents

Tomasz Korbak's research works | University of Sussex and other ...

Tomasz Korbak's 28 research works with 68 citations, including: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.

Tomek Korbak — personal homepage

I'm a Senior Research Scientist at the UK AI Safety Institute working with Geoffrey Irving on safety cases for frontier models.

Tomasz Korbak | Semantic Scholar

Semantic Scholar profile for Tomasz Korbak, with 50 highly influential citations and 21 scientific research papers.

Papers - Tomek Korbak

Korbak, T. (2022). Self-organisation, (M, R)–systems and enactive cognitive science. Adaptive Behavior. Korbak, ...

Tomasz Korbak | Papers With Code

The availability of large pre-trained models is changing the landscape of Machine Learning research and practice, moving from a "training from scratch" to a " ...

[2302.08582] Pretraining Language Models with Human Preferences

Pretraining Language Models with Human Preferences. Authors: Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Bhalerao, Christopher L. Buckley, ...

Tomek Korbak's research works - ResearchGate

Tomek Korbak's scientific contributions. This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the ...

Tomasz Korbak (0000-0002-6258-2013) - ORCID

Works (5): Self-organisation, (M, R)–systems and enactive cognitive science. Adaptive Behavior. 2023-02 | Journal article; Enough blanket metaphysics, time for ...

Tomasz Korbak - CatalyzeX

View Tomasz Korbak's papers and open-source code. See more researchers and engineers like Tomasz Korbak.

Tomasz Korbak - DBLP

Tomasz Korbak: Self-organisation, (M, R)-systems and enactive cognitive science.

On Reinforcement Learning and Distribution Matching for Fine ...

On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting. Authors: Tomasz Korbak, Hady ...

Tomek Korbak (@tomekkorbak) / X

Tomek Korbak's posts. Pinned ... research and work towards shared policies, standards, and guidance to tackle ...

Tomek Korbak - LessWrong

Senior Research Scientist at UK AISI working on frontier AI safety cases ... Cool work! Reminds me a bit of my submission to ...

Self-organisation, (M, R)–systems and enactive cognitive science

Tomasz Korbak is a PhD student at the Department of Informatics, University ...

Controlling Conditional Language Models without Catastrophic ...

Controlling Conditional Language Models without Catastrophic Forgetting. Tomasz Korbak, Hady Elsahar, German Kruszewski, Marc Dymetman. Proceedings of the 39th ...

Tomek Korbak - AI Alignment Forum

Tomek Korbak's profile on the AI Alignment Forum — A community blog devoted to technical AI alignment research.

Tomasz Korbak - FAR.AI

FAR.AI works to ensure AI systems are trustworthy and beneficial to society ... Tomasz Korbak. PhD Student. University of Sussex. Tomasz is a PhD student at ...

Controlling Conditional Language Models without Catastrophic ...

*Work done during an internship at Naver Labs Europe. ¹University of Sussex, ²Naver Labs Europe. Correspondence to: Tomasz Korbak.

Papers by Tomasz Korbak - AIModels.fyi

Papers authored by Tomasz Korbak. ... This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs) ...