The Defence Science and Technology Laboratory required a gold-standard dataset on which to train and evaluate automated text extraction algorithms. Aleph Insights produced this dataset, achieving a higher standard of annotation confidence at a lower cost than the customer had considered achievable. The dataset is now being used successfully to develop the next generation of text extraction technologies.