OlmOCR: Open-source tool to extract plain text from PDFs