We develop a reinforcement-learning algorithm that constructs a feedback policy delivering quantum-enhanced interferometric phase estimation with up to 100 photons in a noisy environment. We ensure that the calculations scale by distributing the workload across a cluster and by vectorizing time-critical operations. We also reduce the running time by introducing accept-reject criteria that terminate the calculation once a successful result is reached. Furthermore, we make the learning algorithm robust to noise by fine-tuning how the objective function is evaluated. Our results show the importance and relevance of well-designed classical machine-learning algorithms for problems in quantum physics.