
Gradient estimation for smooth stopping criteria

Published online by Cambridge University Press:  15 June 2022

Bernd Heidergott*
Affiliation:
Vrije Universiteit Amsterdam
Yijie Peng**
Affiliation:
Peking University
*Postal address: Department of Operations Analytics, De Boelelaan 1105, 1081 HV Amsterdam. Email address: b.f.heidergott@vu.nl
**Postal address: Guanghua School of Management, 52 Haidian Rd, Beijing. Email address: pengyijie@pku.edu.cn

Abstract

We establish sufficient conditions for differentiability of the expected cost collected over a discrete-time Markov chain until it enters a given set. The parameter with respect to which differentiability is analysed may simultaneously affect the Markov chain and the set defining the stopping criterion. The general statements on differentiability lead to unbiased gradient estimators.

Type
Original Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust
