Last active March 17, 2021 05:50
Extract text from PDF documents programmatically using a REST API in Python.
Extract Text from PDF Documents
1. Programmatically upload a PDF file to the cloud.
2. Extract text from a PDF document programmatically using Python.
3. Download the text file from the cloud.
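# import the GroupDocs.Parser Cloud SDK
import groupdocs_parser_cloud

# client credentials and API configuration
client_id = "112f0f38-9dae-42d5-b4fc-cc84ae644972"
client_secret = "16ad3fe0bdc39c910f57d2fd48a5d618"
configuration = groupdocs_parser_cloud.Configuration(client_id, client_secret)
configuration.api_base_url = "https://api.groupdocs.cloud"
# storage name (left empty in these samples)
my_storage = ""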
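The configuration above hard-codes the client ID and secret. One alternative is to read them from the environment before building the configuration. The following is a minimal sketch; the variable names `GROUPDOCS_CLIENT_ID` and `GROUPDOCS_CLIENT_SECRET` and the helper `load_credentials` are hypothetical and are not defined by the SDK.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) from a mapping (os.environ by default).

    The variable names below are hypothetical, chosen only for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return env["GROUPDOCS_CLIENT_ID"], env["GROUPDOCS_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty credentials.
        raise RuntimeError(f"missing environment variable: {missing}") from None


# Usage (the values would then be passed to Configuration as in the sample above):
# client_id, client_secret = load_credentials()
```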
# api initialization
parseApi = groupdocs_parser_cloud.ParseApi.from_config(configuration)
# define text options
options = groupdocs_parser_cloud.TextOptions()
options.file_info = groupdocs_parser_cloud.FileInfo()
options.file_info.file_path = "sample.pdf"
request = groupdocs_parser_cloud.TextRequest(options)
result = parseApi.text(request)
print("Text: " + result.text)
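The sample above only prints the extracted text. If the text should be kept, it can be written to a local file with the standard library. This is a sketch under the assumption that the extracted text is an ordinary Python string; `save_text` and the output path are hypothetical names, not part of the SDK.

```python
from pathlib import Path


def save_text(text, path):
    """Write text to a UTF-8 file, creating parent folders as needed.

    Returns the number of characters written.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target.write_text(text, encoding="utf-8")


# Usage with the result from the sample above:
# save_text(result.text, "output/sample.txt")
```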
# api initialization
parseApi = groupdocs_parser_cloud.ParseApi.from_config(configuration)
# define text options
options = groupdocs_parser_cloud.TextOptions()
options.file_info = groupdocs_parser_cloud.FileInfo()
options.file_info.file_path = "sample.pdf"
options.start_page_number = 1
options.count_pages_to_extract = 2
request = groupdocs_parser_cloud.TextRequest(options)
result = parseApi.text(request)
for page in result.pages:
    print("PageIndex: " + str(page.page_index) + ". Text: " + page.text)
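The loop above prints one line per page. When the per-page text needs to be combined into a single string, a small helper can do the joining. The sketch below relies only on the `page_index` and `text` attributes used in the sample; `format_pages` is a hypothetical helper, and the demo pages are stand-in objects with arbitrary values rather than real API output.

```python
from types import SimpleNamespace


def format_pages(pages):
    """Join per-page text into one string, labelling each page by its index."""
    return "\n\n".join(f"[page {p.page_index}]\n{p.text}" for p in pages)


# Stand-in data for illustration only; real pages would come from result.pages.
demo_pages = [
    SimpleNamespace(page_index=1, text="First page."),
    SimpleNamespace(page_index=2, text="Second page."),
]
print(format_pages(demo_pages))
```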
# api initialization
parseApi = groupdocs_parser_cloud.ParseApi.from_config(configuration)
# define text options
options = groupdocs_parser_cloud.TextOptions()
options.file_info = groupdocs_parser_cloud.FileInfo()
options.file_info.file_path = "PDF_with_attachements.pdf"
options.file_info.password = "password"
container_info = groupdocs_parser_cloud.ContainerItemInfo()
container_info.relative_path = "template-document.pdf"
options.container_item_info = container_info
options.start_page_number = 2
options.count_pages_to_extract = 1
request = groupdocs_parser_cloud.TextRequest(options)
result = parseApi.text(request)
print("Text: " + result.pages[0].text)
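The last line of the sample indexes `result.pages[0]` directly, which raises an error in Python if the list is empty or missing. A defensive variant is sketched below; `first_page_text` is a hypothetical helper, and whether the service can actually return an empty page list is not stated in the source.

```python
from types import SimpleNamespace


def first_page_text(pages, default=""):
    """Return the text of the first page, or `default` when there are no pages."""
    if not pages:  # covers both None and an empty list
        return default
    return pages[0].text


# Stand-in object for illustration; real pages would come from result.pages.
print("Text: " + first_page_text([SimpleNamespace(text="Attachment text.")]))
```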
# api initialization
file_api = groupdocs_parser_cloud.FileApi.from_config(configuration)
my_storage = ""
request = groupdocs_parser_cloud.UploadFileRequest("sample.pdf", "C:\\Files\\sample.pdf", my_storage)
response = file_api.upload_file(request)
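The upload sample passes a hard-coded Windows path and repeats the file name by hand. A possible refinement is to derive both arguments from one local path and to check that the file exists before calling the API. This is only a sketch: `upload_arguments` is a hypothetical helper, and reusing the local file name as the remote name is an assumption that mirrors the sample.

```python
from pathlib import Path


def upload_arguments(local_path):
    """Return (remote_name, local_path_string) for an existing local file."""
    path = Path(local_path)
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    # Reuse the local file name as the name in cloud storage, as the sample does.
    return path.name, str(path)


# Usage with the request from the sample above:
# remote_name, local_file = upload_arguments(r"C:\Files\sample.pdf")
# request = groupdocs_parser_cloud.UploadFileRequest(remote_name, local_file, my_storage)
```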