weakly-communicating mdps
weakly-communicating mdps - OpenReview
Weakly-communicating MDPs are the most general class of MDPs that a learning algorithm with a single stream of experience can guarantee obtaining a policy ...
Span-Based Optimal Sample Complexity for Weakly Communicating ...
We study the sample complexity of learning an \varepsilon-optimal policy in an average-reward Markov decision process (MDP) under a generative model.
REGAL: A Regularization based Algorithm for Reinforcement ... - arXiv
We provide an algorithm that achieves the optimal regret rate in an unknown weakly communicating Markov Decision Process (MDP).
Learning Unknown Markov Decision Processes: A Thompson ...
To have meaningful finite time regret bounds, we consider the subclass of weakly communicating MDPs defined as follows. Definition 1. An MDP is weakly ...
REGAL Revisited: Regularized Reinforcement Learning for Weakly ...
By studying MDPs, we can develop techniques and algorithms which can be applied to many real-world problems where an agent is required to act intelligently.
Weakly Coupled Markov Decision Processes with Imperfect ...
This paper considers two classes of weakly coupled MDPs with imperfect information. In the first case, the transition probabilities for each sub-MDP are ...
REGAL: A Regularization based Algorithm for Reinforcement ... - TTIC
Thus, we have proved the result for all weakly communicating MDPs. We can now derive the fact that $\mathrm{sp}(h^\star) \le D_{\mathrm{ow}}$. Corollary 5. For any weakly communicating MDP ...
Communicating MDPs: Equivalence and LP properties - ScienceDirect
Abstract. It is shown that the communicating property of Markov Decision Processes (MDPs) is equivalent to satisfaction of sets of linear equations. A mapping ...
Classification Problems in MDPs - SpringerLink
These MDPs can be classified in several ways. One way is based on the concept communicating, and distinguishes between communicating, weakly communicating and ...
Span-Based Optimal Sample Complexity for Weakly Communicating ...
... (MDP) under a generative model. For weakly communicating MDPs, we establish the complexity bound $\widetilde{O}\left(SA\frac{\mathsf{H} ...
Reinforcement Learning for Weakly-Coupled MDPs and an ...
In this paper, we focus on reinforcement learning for weakly-coupled MDPs. A weakly-coupled MDP is an MDP that has a natural decomposition into a set of ...
REGAL Revisited: Regularized Reinforcement Learning for Weakly ...
Abstract. Markov Decision Processes (MDPs) are used extensively in artificial intelligence and reinforcement learning to describe how an agent interacts ...
Learning Unknown Markov Decision Processes: A Thompson ...
Abstract. We consider the problem of learning an unknown Markov Decision Process (MDP) that is weakly communicating in the infinite horizon setting. We propose ...
Reinforcement Learning for Weakly-Coupled MDPs and an ...
For experimentation we use a problem from autonomous planetary rover control that can be modeled as a weakly-coupled MDP. In our decision-making scenario, a ...
Solving Very Large Weakly Coupled Markov Decision Processes
We can therefore view each task as an MDP. However, these MDPs are weakly coupled by resource constraints: actions selected for one MDP restrict the ...
Logarithmic regret in communicating MDPs: Leveraging ... - HAL
Abstract. We study regret minimization in an average-reward and communicating Markov Decision Process (MDP) with known dynamics, ...
CLASSIFICATION PROBLEMS IN MDPS
(ii) the concepts of a unichain and weakly communicating Markov chain coincide and correspond to m = 1; ...
Self-Adapting Network Relaxations for Weakly Coupled Markov ...
... weakly coupled Markov decision processes (WDPs) arise in dynamic decision making and reinforcement learning, decomposing into smaller MDPs when ...
Reduction of total-cost and average-cost MDPs with weakly ...
This paper describes conditions under which undiscounted MDPs with infinite state spaces and weakly continuous transition kernels can be transformed into ...