Despite the growing impact of emissions on health and the environment, the need for emission concentration prediction remains unmet. The accumulating volume of monitoring-station and satellite data makes the problem well suited to machine learning. This work formulates the spatiotemporal prediction of emission concentration as a machine learning task. To this end, an evaluation framework, comprising baseline models and two metrics (per-pixel loss and intersection-over-union accuracy), and a simple ConvLSTM model were developed. The model generates one-hour-ahead emission concentration forecasts with lower loss (6.5% and 30.5% lower) and higher accuracy (18.4% and 18.6% higher) than the input-independent and random baseline models, respectively, at the end of training. Crucially, unlike conventional solutions, our model generalizes to unseen emission sources with no significant decrease in accuracy.
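
To make the two evaluation metrics concrete, the following is a minimal sketch of how a per-pixel loss and an intersection-over-union (IoU) accuracy could be computed on a single predicted concentration map. The abstract does not specify the exact definitions, so several details here are assumptions for illustration only: the loss is taken to be mean squared error, the IoU is computed on masks obtained by thresholding both maps, and the function names `per_pixel_loss` and `iou_accuracy`, the `threshold` parameter, and the toy grids are all hypothetical.

```python
def per_pixel_loss(pred, target):
    """Mean squared error averaged over all pixels (assumed loss form)."""
    squared_errors = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(squared_errors) / len(squared_errors)


def iou_accuracy(pred, target, threshold=0.5):
    """IoU of the regions where each map meets the (assumed) threshold."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_on = p >= threshold
            target_on = t >= threshold
            intersection += pred_on and target_on
            union += pred_on or target_on
    # Two empty masks agree perfectly; treat that case as IoU = 1.
    return 1.0 if union == 0 else intersection / union


# Toy 2x2 concentration maps (hypothetical values, not data from this work).
target = [[0.0, 0.8], [0.9, 0.1]]
pred = [[0.0, 0.6], [0.2, 0.1]]

loss = per_pixel_loss(pred, target)
accuracy = iou_accuracy(pred, target)
```

Under these assumptions, the loss penalizes every pixel in proportion to its squared error, while the IoU rewards only spatial overlap of the high-concentration regions, so the two metrics capture complementary aspects of forecast quality.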