Reading PDF files using QTP (9735 Views)
Occasional Contributor
ChandraGunda
Posts: 4
Registered: ‎09-07-2010
Message 1 of 5 (9,735 Views)

Reading PDF files using QTP

Hi

 

I have PDF files with data laid out in table format. I want to read each table (column names and the corresponding row data) and put the data into a spreadsheet or the DataTable using QTP. Can anyone suggest the best approach for doing this with QTP?

 

Thanks in advance for your reply or suggestions in this regard.

 

Please see the sample pdf file screenshot in attachments.

 

Thanks,

Chandra

Advisor
Mr_Spock
Posts: 20
Registered: ‎07-26-2010
Message 2 of 5 (9,716 Views)

Re: Reading PDF files using QTP

The following sample code demonstrates the use of the Acrobat API to get text from a PDF file. Please note that it is provided on an "as is" basis, is not part of QuickTest Professional, and is not supported by HP.
 
Option Explicit
Dim acroApp, acroAVDoc, acroPDDoc, acroRect, PDTextSelect
Dim gPDFPath, nElem, pageNo
pageNo = 0 ' zero-based index: first page of the PDF file

gPDFPath = "C:\UFTLog.pdf"
' ** Initialize Acrobat by creating the App object
Set acroApp = CreateObject("AcroExch.App")
' ** Show Acrobat
acroApp.Show()
' ** Create the AVDoc object
Set acroAVDoc = CreateObject("AcroExch.AVDoc")
' ** Open the PDF
If acroAVDoc.Open(gPDFPath, "Accessing PDF's") Then
  If acroAVDoc.IsValid = False Then ExitTest()
  acroAVDoc.BringToFront()
  Call acroAVDoc.Maximize(True)
  Print "Current pdf title ---> " & acroAVDoc.GetTitle()
  Set acroPDDoc = acroAVDoc.GetPDDoc()
  Print "File Name ---> " & acroPDDoc.GetFileName()
  Print "Number of Pages ---> " & acroPDDoc.GetNumPages()
  ' ** Rectangle (in page coordinates) that bounds the text to select
  Set acroRect = CreateObject("AcroExch.Rect")
  acroRect.Top = 380
  acroRect.Left = 400
  acroRect.Bottom = 100
  acroRect.Right = 500
  ' ** Select the text inside the rectangle on page index pageNo (0 = first page)
  Set PDTextSelect = acroPDDoc.CreateTextSelect(pageNo, acroRect)
  If PDTextSelect Is Nothing Then
    Print "Unable to Create TextSelect object."
    ExitTest()
  End If

  Call acroAVDoc.SetTextSelection(PDTextSelect)
  Call acroAVDoc.ShowTextSelect()
  Print "Selection Page Number ---> " & PDTextSelect.GetPage()
  Print "Selection Text Elements ---> " & PDTextSelect.GetNumText()
  ' ** Looping through text elements
  For nElem = 0 To PDTextSelect.GetNumText() - 1
    Print "Text # " & nElem & " ---> '" & PDTextSelect.GetText(nElem) & "'"
  Next
  ' ** Destroying Text Selection
  Call PDTextSelect.Destroy()
End If
' ** Close all documents, quit Acrobat and release the objects
acroApp.CloseAllDocs()
acroApp.Exit()
Set PDTextSelect = Nothing : Set acroRect = Nothing
Set acroPDDoc = Nothing : Set acroAVDoc = Nothing
Set acroApp = Nothing

Important Note:
 
The Microsoft Windows version of Acrobat is an OLE Automation server. In order to use the OLE objects made available by Acrobat API, the system must have the full Acrobat product installed. If the system only has the Acrobat Reader installed this will not work and errors indicating that the ActiveX component cannot be created will be shown.
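
To fail early with a clear message instead of a runtime error, the script can first check whether the Acrobat automation class can be created at all. The following is only a minimal sketch: the helper name IsComClassAvailable is made up for illustration and is not part of QTP or the Acrobat API. Note that creating AcroExch.App may briefly start Acrobat.

```vbscript
' Hypothetical helper (not part of QTP or Acrobat): returns True when
' CreateObject succeeds for the given ProgID, False otherwise.
Function IsComClassAvailable(progId)
    Dim obj
    On Error Resume Next            ' trap the "ActiveX component can't create object" error
    Set obj = CreateObject(progId)
    IsComClassAvailable = (Err.Number = 0)
    Err.Clear
    On Error GoTo 0                 ' restore normal error handling
    Set obj = Nothing
End Function
```

A test could then call IsComClassAvailable("AcroExch.App") before running the sample above and report that the full Acrobat product appears to be missing when it returns False.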
Advisor
Mr_Spock
Posts: 20
Registered: ‎07-26-2010
Message 3 of 5 (9,715 Views)

Re: Reading PDF files using QTP


Important Note:

QuickTest Professional does not have special support for working with Acrobat or .PDF files. The content of this article is provided on an "as is" basis and is not part of QuickTest Professional. It is not guaranteed to work and is not supported by HP Customer Support. The user is responsible for any and all modifications that may be required.

Please be aware that the steps needed to capture text may change with different versions of Adobe Acrobat Reader.

Acrobat Reader 6.0
QuickTest Professional cannot capture text from the Acrobat window using its built-in functionality, but it should still be possible to get the text:

1. Enable text selection in Acrobat Reader.
2. Select the text you wish to capture.
3. Copy the text to the system clipboard.
4. Use the Clipboard object to retrieve the text.

5. Once the text is in a variable, you can use VBScript string functions (e.g., InStr, Left, Right, Mid, Split) to parse the string and extract information from it.
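
As a rough illustration of the parsing step, the sketch below splits copied text into rows and each row into cells. It is only a sketch under assumptions: ParseTableText is a made-up helper name, and it assumes each table row arrives on its own line with a single known delimiter (for example a tab) between cells. Text copied from a real PDF may not follow that layout, so inspect the clipboard content first.

```vbscript
' Hypothetical helper: turns copied text into an array of rows,
' where each row is itself an array of cell strings.
' Assumes one table row per line and one known delimiter between cells.
Function ParseTableText(text, delimiter)
    Dim normalized, lines, rows(), i, count

    ' Nothing to parse: return an empty array
    If Len(Trim(text)) = 0 Then
        ParseTableText = Array()
        Exit Function
    End If

    ' Normalize line endings so Split sees a single separator
    normalized = Replace(Replace(text, vbCrLf, vbLf), vbCr, vbLf)
    lines = Split(normalized, vbLf)

    ReDim rows(UBound(lines))
    count = 0
    For i = 0 To UBound(lines)
        ' Skip blank lines
        If Len(Trim(lines(i))) > 0 Then
            rows(count) = Split(lines(i), delimiter)
            count = count + 1
        End If
    Next

    If count = 0 Then
        ParseTableText = Array()
    Else
        ReDim Preserve rows(count - 1)
        ParseTableText = rows
    End If
End Function

' Usage: the first row holds the column names, later rows hold the data
Dim sampleText, table
sampleText = "Name" & vbTab & "Qty" & vbCrLf & "Bolt" & vbTab & "12"
table = ParseTableText(sampleText, vbTab)
' table(0)(1) is the second column name, table(1)(0) the first data cell
```

Returning an array of arrays keeps the header row and data rows in one structure, so the caller can loop over it and write each cell wherever it is needed, such as a DataTable column or a spreadsheet cell.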


Example:
' This function enables Text Selection in Acrobat Reader 6.0
Public Function AcrobatEnableTextSelection()
    ' Press Alt+T, S, X to enable Text Selection in Acrobat Reader
    Window("regexpwndtitle:=Adobe Reader","regexpwndclass:=AdobeAcrobat").Activate
    Window("regexpwndtitle:=Adobe Reader","regexpwndclass:=AdobeAcrobat").Type micAltDwn + "t" + micAltUp
    Window("regexpwndtitle:=Adobe Reader","regexpwndclass:=AdobeAcrobat").Type "s"
    Window("regexpwndtitle:=Adobe Reader","regexpwndclass:=AdobeAcrobat").Type "x"
    wait 0, 500
End Function

' This function copies the selected text to the system clipboard
Public Function AcrobatCopy(obj)
    ' Copy the selected text to the clipboard
    obj.Type micCtrlDwn + "c" + micCtrlUp
End Function

' This function selects all the text in the PDF file
Public Function AcrobatSelectAll(obj)
    obj.Click
    obj.Type micCtrlDwn + "a" + micCtrlUp
End Function

' Selects the text in the specified coordinates. NOTE: The coordinates are relative to the object, not the screen.
Public Function AcrobatSelectPartial(obj, x1, y1, x2, y2)

    ' Calculate the screen coordinates for the text
    ax = obj.GetROProperty("abs_x")
    ay = obj.GetROProperty("abs_y")

    sx = ax + x1
    sy = ay + y1
    ex = ax + x2
    ey = ay + y2

    ' Select the text you wish to copy
    Set DeviceReplay = CreateObject("Mercury.DeviceReplay")

    DeviceReplay.MouseMove sx, sy
    DeviceReplay.MouseDown sx, sy, 0
    DeviceReplay.MouseMove ex, ey
    DeviceReplay.MouseUp ex, ey, 0

    Set DeviceReplay = Nothing
End Function

' Register the functions to the appropriate Test Object Classes.
RegisterUserFunc "WinObject", "AcrobatSelectPartial", "AcrobatSelectPartial"
RegisterUserFunc "WinObject", "AcrobatSelectAll", "AcrobatSelectAll"
RegisterUserFunc "WinObject", "AcrobatCopy", "AcrobatCopy"


' Instantiate the Clipboard object
Set cb = CreateObject("Mercury.Clipboard")

' Clear the Clipboard contents
cb.Clear

' Enable the Text Selection option
AcrobatEnableTextSelection()

' Select all the text in the pdf document and copy it to the clipboard.
Window("Adobe Reader").Window("QTP_SWT_Support.pdf").WinObject("AVPageView").AcrobatSelectAll
Window("Adobe Reader").Window("QTP_SWT_Support.pdf").WinObject("AVPageView").AcrobatCopy

' Get the text from the clipboard using the Clipboard object
pdfText = cb.GetText
msgbox pdfText

Set cb = Nothing


The above example selects all the text in the PDF file. It is also possible to select only the text within specified coordinates.

Example:

' Select text in a specified location
' Instantiate the Clipboard object
Set cb = CreateObject("Mercury.Clipboard")

cb.Clear

' Put focus to the pdf document

Window("Adobe Reader").Window("QTP_SWT_Support.pdf").WinObject("AVPageView").Click 0,0

' Specify the coordinates
x1 = 170
y1 = 88
x2 = 543
y2 = 155

' Capture the text from within the specified coordinates. These coordinates are relative to the object, not the screen.
Window("Adobe Reader").Window("QTP_SWT_Support.pdf").WinObject("AVPageView").AcrobatSelectPartial x1, y1, x2, y2
Window("Adobe Reader").Window("QTP_SWT_Support.pdf").WinObject("AVPageView").AcrobatCopy

' Get the text from the clipboard using the Clipboard object
pdfText = cb.GetText
msgbox pdfText


Note:

If no text is selected by the AcrobatSelectPartial function, check the coordinates you used. If the mouse is not over an area that can be selected, QuickTest Professional will not be able to select the text. You can use Paint to help determine the coordinates; make sure you calculate them relative to the specific object and not the entire Acrobat window.

 

Advisor
DeepaA
Posts: 28
Registered: ‎11-25-2011
Message 4 of 5 (9,668 Views)

Re: Reading PDF files using QTP

Adobe Labs provides the AcroQTP plug-in. Once it is installed, QTP can recognize PDF objects.

Occasional Visitor
qtp293
Posts: 2
Registered: ‎08-02-2012
Message 5 of 5 (9,628 Views)

Re: Reading PDF files using QTP

Can you please provide details of the Acrobat plug-in, or a link to it? That would help.

 

Thank you:)
