
Results found: 5

Search results
Search query:
in keywords: Bellman equation
1
EN
In the paper we solve a system of Bellman equations for a finite-horizon, continuous-time terminal utility maximization problem with general càdlàg bid and ask prices. We assume that only a restricted number of transactions is allowed, at time moments we choose. The main result of the paper is that the system of Bellman equations admits a regular version of its solutions, which enables us to determine the form of nearly optimal strategies.
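The abstract does not state the equations themselves; as a rough, purely illustrative sketch (notation ours, not the paper's), a dynamic programming system of this kind for a value function V_k with k transactions still available could take the form
$$ V_k(t,x) \;=\; \operatorname*{ess\,sup}_{\tau \ge t,\; x' \in \mathcal{A}_\tau(x)} \mathbb{E}\bigl[\, V_{k-1}(\tau, x') \bigm| \mathcal{F}_t \,\bigr], \qquad V_0(t,x) \;=\; \mathbb{E}\bigl[\, U(\ell_T(x)) \bigm| \mathcal{F}_t \,\bigr], $$
where τ ranges over the admissible transaction times, 𝒜_τ(x) is the set of positions reachable from x by a single transaction at the bid and ask prices prevailing at τ, ℓ_T(x) is the liquidation value of x at the horizon T, and U is the terminal utility.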
2
The Bruss-Robertson Inequality: Elaborations, Extensions, and Applications
EN
The Bruss-Robertson inequality gives a bound on the maximal number of elements of a random sample whose sum is less than a specified value. The extension of that inequality given here requires neither the independence of the summands nor the equality of their marginal distributions. A review is also given of applications of the Bruss-Robertson inequality, especially to problems of combinatorial optimization such as the sequential knapsack problem and the sequential monotone subsequence selection problem.
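The quantity bounded by the inequality, the maximal number of sample elements whose sum stays within a given budget, is easy to compute directly: for non-negative summands it is optimal to take the smallest elements first. A minimal sketch (function and variable names are ours, assuming non-negative summands; it illustrates the quantity only, not the bound itself):

import numpy as np

def max_count_under_budget(sample, s):
    """Maximal number of elements of `sample` whose sum does not exceed s.

    Sorting ascending and taking the longest feasible prefix maximizes the
    count of selected elements under a sum constraint (non-negative summands).
    """
    ordered = np.sort(np.asarray(sample, dtype=float))
    return int(np.searchsorted(np.cumsum(ordered), s, side="right"))

# Example: Monte Carlo estimate of the expected count for uniform(0,1) summands.
rng = np.random.default_rng(0)
n, s, reps = 100, 2.0, 2_000
estimate = np.mean([max_count_under_budget(rng.uniform(size=n), s) for _ in range(reps)])
print(estimate)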
3
The machine learning approach: analysis of experimental results
EN
The article analyses a reinforcement learning method in which the subject of learning is defined. The essence of the method is the selection of actions by trial and error and the granting of delayed rewards. The theoretical analysis is supplemented by practical experiments with an implementation of the Sarsa(λ) algorithm using replacing eligibility traces and an ε-greedy policy.
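The article's own code is not reproduced here; the following is a minimal, self-contained sketch of tabular Sarsa(λ) with replacing eligibility traces and an ε-greedy policy, run on a toy chain environment invented for illustration (all names, parameters and the environment are assumptions, not taken from the article):

import numpy as np

def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[state]))

def sarsa_lambda(n_states=5, n_actions=2, episodes=200, alpha=0.1,
                 gamma=0.95, lam=0.9, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) with replacing traces on a toy chain.

    States 0..n_states-1; action 1 moves right, action 0 moves left.
    Reaching the rightmost state ends the episode with reward +1.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        E = np.zeros_like(Q)                  # eligibility traces
        s = 0
        a = epsilon_greedy(Q, s, epsilon, rng)
        done = False
        while not done:
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = (s_next == n_states - 1)
            r = 1.0 if done else 0.0
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)
            # TD error; the terminal state has value 0 by convention.
            delta = r + (0.0 if done else gamma * Q[s_next, a_next]) - Q[s, a]
            E *= gamma * lam                  # decay all traces
            E[s, a] = 1.0                     # replacing trace: reset to 1, not incremented
            Q += alpha * delta * E            # update every recently visited pair
            s, a = s_next, a_next
    return Q

if __name__ == "__main__":
    print(sarsa_lambda())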
4
Adaptive Machine Reinforcement Learning
EN
This article defines a reinforcement learning method in which the subject of learning is analyzed. The essence of the method is the selection of actions by trial and error and the granting of delayed rewards. If the environment has the Markov property, its step-by-step dynamics make it possible to predict the next state and the next reward from the currently known state and action, so the problem can be treated as a Markov decision process. The relationship between the value of the present state and the values of possible future states is given by the Bellman equation. The article also discusses temporal-difference learning, the mechanism of eligibility traces, and the corresponding TD(0) and TD(λ) algorithms. The theoretical analysis is supplemented by practical experiments with an implementation of the Sarsa(λ) algorithm using replacing eligibility traces and an ε-greedy policy.
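In standard notation (not taken from the article itself), the Bellman expectation equation relating the value of the present state to the values of its successors, and the TD(0) update built on it, read
$$ v_\pi(s) \;=\; \sum_{a} \pi(a \mid s) \sum_{s',\,r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_\pi(s') \,\bigr], $$
$$ V(S_t) \;\leftarrow\; V(S_t) + \alpha \bigl[\, R_{t+1} + \gamma\, V(S_{t+1}) - V(S_t) \,\bigr], $$
where γ is the discount factor and α the learning rate; TD(λ) interpolates between this one-step update and full-return updates through the eligibility-trace parameter λ.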
5
Discrete time portfolio selection with proportional transaction costs
EN
The paper studies discrete-time portfolio selection with maximization of a discounted satisfaction functional. Section 2 considers the case without transaction costs and gives explicit solutions for special satisfaction functions. Section 3 investigates the problem with proportional transaction costs and characterizes the optimal strategies.
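As a purely illustrative sketch of the kind of objective involved (notation ours; the paper's precise formulation may differ), a discounted satisfaction functional over consumption amounts (c_t) has the shape
$$ \max_{(c_t,\,\pi_t)} \ \mathbb{E}\Biggl[\sum_{t \ge 0} \beta^t\, u(c_t)\Biggr], \qquad \beta \in (0,1), $$
where u is a concave satisfaction (utility) function and π_t the portfolio held at time t; with proportional transaction costs, each rebalancing of π_t additionally reduces wealth by a fixed fraction of the amounts bought and sold.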