How to automate with Mouse, Keyboard, and OCR

Symptom

If an automation cannot be performed with Web or UI automation actions, the best alternative is surface automation, which drives the application through keystrokes, mouse clicks, image matching, and OCR.

Common cases where Web/UI automation is difficult or not feasible include the following:

  • The Web/UI elements cannot be captured through either the Recorders or the Actions

  • The web page/application does not allow external software to interact with it

  • The selectors of the Web/UI elements are changing dynamically

  • The web page/application is being automated through Citrix/RDP

Verifying issue

To verify that the Web/UI actions cannot be used to interact with a web page/application, first try accessing and interacting with the elements through the Web/Desktop recorders, and then through the Web automation/UI automation actions.

In case of a web page/web application, make sure that the correct settings are in place.

In case of a desktop application, check whether the app runs as administrator. If it does, run PAD as administrator as well.

Best practices

Bring to focus

Before starting to build your surface automation, always make sure that the right window is focused and maximized.

Actions that can be used:

  • Focus window

  • Set window state

Wait

PAD executes actions very quickly, so allow your system some time to respond to them: add a wait action before each subsequent action.

Also note that a desktop flow runs much faster from the PAD console or from a cloud flow than from the PAD designer, so wait times that are sufficient in the designer may be too short elsewhere.

Actions that can be used:

  • Wait

  • Wait for image

  • Wait for text on screen (OCR)
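The difference between a fixed "Wait" and the conditional "Wait for image" and "Wait for text on screen (OCR)" actions can be illustrated outside PAD. The Python sketch below is not PAD code and does not describe how PAD implements these actions; it only shows the general polling pattern of checking a condition repeatedly until it holds or a timeout expires. The names wait_until, condition, timeout, and interval are hypothetical and chosen for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True          # the expected screen state appeared
        if time.monotonic() >= deadline:
            return False         # give up; the caller decides how to react
        time.sleep(interval)     # pause before checking again
```

A conditional wait of this kind continues as soon as the expected image or text appears, whereas a fixed wait always pauses for the full duration. In this sketch, the condition would be a caller-supplied check such as "is the image on screen?", which is assumed here and not provided.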

Send mouse clicks & keys

To navigate through an application/web page or populate text fields, you can use the mouse and keyboard actions.

Actions that can be used:

  • Send keys

  • Press/release key

  • Move mouse to image

  • Send mouse click

  • Move mouse to text on screen (OCR)

Extract data

To retrieve text from the screen and store it in a variable, you can use either the clipboard or OCR actions.

Actions that can be used:

  • Get clipboard text

  • Extract text with OCR

The "Get clipboard text" action stores the clipboard text in a variable. For this to work, the text must first be copied to the clipboard.

To copy the text to the clipboard, first highlight it using either the "Send keys" action (for example, send the CTRL+A keyboard shortcut) or the "Send mouse click" action (for example, send a "Left button down" at the beginning of the text, and then a "Left button up" at the end of the text). Then send the CTRL+C keyboard shortcut to copy the highlighted text to the clipboard.
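The order of the steps above (highlight, copy, then read the clipboard) can be sketched as follows. This is a Python illustration, not PAD code: the Desktop interface, the copy_all_text function, the FakeDesktop stand-in, and the "CTRL+A"/"CTRL+C" strings are all hypothetical, and each method merely stands in for the PAD action named in its comment.

```python
from typing import Protocol


class Desktop(Protocol):
    """Hypothetical interface; each method stands in for one PAD action."""
    def send_keys(self, keys: str) -> None: ...      # "Send keys"
    def wait(self, seconds: float) -> None: ...      # "Wait"
    def get_clipboard_text(self) -> str: ...         # "Get clipboard text"


def copy_all_text(desktop: Desktop, settle: float = 0.5) -> str:
    """Highlight the text in the focused control, copy it, and return it."""
    desktop.send_keys("CTRL+A")          # highlight the text
    desktop.wait(settle)                 # let the application react
    desktop.send_keys("CTRL+C")          # copy the selection to the clipboard
    desktop.wait(settle)
    return desktop.get_clipboard_text()  # store the clipboard text in a variable


class FakeDesktop:
    """In-memory stand-in used only to show the order of the calls."""
    def __init__(self, field_text: str) -> None:
        self.field_text = field_text
        self.selected = ""
        self.clipboard = ""
        self.log = []

    def send_keys(self, keys: str) -> None:
        self.log.append(keys)
        if keys == "CTRL+A":
            self.selected = self.field_text
        elif keys == "CTRL+C":
            self.clipboard = self.selected

    def wait(self, seconds: float) -> None:
        self.log.append("wait")

    def get_clipboard_text(self) -> str:
        return self.clipboard
```

Calling copy_all_text(FakeDesktop("Invoice 42")) returns the field text only because the copy step runs after the highlight step; if CTRL+C were sent first, the clipboard in this stand-in would still be empty.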
