TITLE : Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent
AUTHOR(S) : Bartocci E, Bortolussi L, Brazdil T, Milios D, Sanguinetti G
TYPE : Conference article
YEAR : 2016
CODE : 424323
*** DO NOT EDIT THIS FILE ***