.. _sec_rlsearcher:

Demo RL Searcher
================

In this tutorial, we compare the RL searcher with random search in a
simulated environment.

A Toy Reward Space
------------------

The input space is ``x = [0: 99], y = [0: 99]``. The reward is a mixture
of two Gaussians, which we construct and visualize below. First, import
the required packages:

.. code:: python

    import numpy as np
    import matplotlib.pyplot as plt
    # Registers the 3D projection on older Matplotlib versions
    from mpl_toolkits.mplot3d import Axes3D

Generate the simulated rewards as a mixture of two Gaussians:

.. code:: python

    def gaussian(x, y, x0, y0, xalpha, yalpha, A):
        # 2D Gaussian bump centered at (x0, y0) with widths
        # (xalpha, yalpha) and peak height A
        return A * np.exp(-((x - x0) / xalpha)**2 - ((y - y0) / yalpha)**2)

    x, y = np.linspace(0, 99, 100), np.linspace(0, 99, 100)
    X, Y = np.meshgrid(x, y)

    Z = np.zeros(X.shape)
    # Each tuple is (x0, y0, xalpha, yalpha, A)
    ps = [(20, 70, 35, 40, 1),
          (80, 40, 20, 20, 0.7)]
    for p in ps:
        Z += gaussian(X, Y, *p)

Visualize the reward space:

.. code:: python

    fig = plt.figure()
    # fig.gca(projection='3d') is deprecated since Matplotlib 3.4
    ax = fig.add_subplot(projection='3d')
    ax.plot_surface(X, Y, Z, cmap='plasma')
    ax.set_zlim(0, np.max(Z) + 2)
    plt.show()

Simulation Experiment
---------------------

Customize Train Function
~~~~~~~~~~~~~~~~~~~~~~~~

We can decorate any function with ``@ag.args``, which makes the function
searchable by AutoGluon. The ``reporter`` is used to communicate results
back to the AutoGluon search algorithms.

.. code:: python

    import autogluon.core as ag

    @ag.args(
        x=ag.space.Categorical(*list(range(100))),
        y=ag.space.Categorical(*list(range(100))),
    )
    def rl_simulation(args, reporter):
        x, y = args.x, args.y
        # Z comes from np.meshgrid, so rows index y and columns index x
        reporter(accuracy=Z[y][x])

Random Search Baseline
~~~~~~~~~~~~~~~~~~~~~~

As the baseline, we run 300 trials with ``FIFOScheduler``:

.. code:: python

    random_scheduler = ag.scheduler.FIFOScheduler(rl_simulation,
                                                  resource={'num_cpus': 1, 'num_gpus': 0},
                                                  num_trials=300,
                                                  reward_attr='accuracy')
    random_scheduler.run()
    random_scheduler.join_jobs()
    print('Best config: {}, best reward: {}'.format(
        random_scheduler.get_best_config(), random_scheduler.get_best_reward()))

.. parsed-literal::
    :class: output

    0%| | 0/300 [00:00, ]
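
As a rough cross-check of the random search baseline that does not depend
on AutoGluon, the same idea can be sketched in plain NumPy: draw 300
configurations uniformly from the grid and keep the one with the highest
reward. The helper ``random_search`` and the seed value are hypothetical
names introduced for this illustration only; they are not part of the
AutoGluon API, and the scheduler may sample configurations differently.
The block rebuilds the reward surface so that it runs on its own.

.. code:: python

    import numpy as np

    def gaussian(x, y, x0, y0, xalpha, yalpha, A):
        return A * np.exp(-((x - x0) / xalpha)**2 - ((y - y0) / yalpha)**2)

    # Rebuild the toy reward surface from the tutorial
    x, y = np.linspace(0, 99, 100), np.linspace(0, 99, 100)
    X, Y = np.meshgrid(x, y)
    Z = np.zeros(X.shape)
    for p in [(20, 70, 35, 40, 1), (80, 40, 20, 20, 0.7)]:
        Z += gaussian(X, Y, *p)

    def random_search(Z, num_trials=300, seed=0):
        # Hypothetical helper: uniform random search over the grid
        rng = np.random.default_rng(seed)
        xs = rng.integers(0, Z.shape[1], size=num_trials)
        ys = rng.integers(0, Z.shape[0], size=num_trials)
        rewards = Z[ys, xs]  # rows index y, columns index x
        best = int(np.argmax(rewards))
        config = {'x': int(xs[best]), 'y': int(ys[best])}
        return config, float(rewards[best])

    best_config, best_reward = random_search(Z)

    # Exhaustive scan of all 10,000 grid points for reference
    iy, ix = np.unravel_index(np.argmax(Z), Z.shape)
    print('Random search: {}, reward: {}'.format(best_config, best_reward))
    print('Grid optimum: x={}, y={}, reward: {}'.format(ix, iy, Z[iy, ix]))

Because the grid is small, the exhaustive scan gives the true optimum, so
the gap between the two printed rewards shows how far a 300-trial random
search falls short on this surface for the chosen seed.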