Are your organization's RPA robots victims of spear-phishing?

Introduction

Can robots be dumber than humans at detecting phishing campaigns? Yes, they can. They are only as smart or as dumb as we program them to be.

Can a software robot really qualify as a victim? I don't really know the answer to this one, but I can say that I've seen a lot of bullying and finger pointing towards RPA robots by their human colleagues. But again, the robots only do what they are instructed to do, no more, no less.

Can an organization that applies RPA, to save some costs and reduce errors in its processes, be unknowingly opening a hole for a criminal mastermind to peek through? Maybe. It depends on whether its RPA developers read this article or not.

The concept is simple: a robot reads emails from a mailbox and performs some actions on them (especially using a URL present in the email or downloading attachments), but it does not validate the domain of the sender's email address. This kind of robot might be in danger of opening a nefarious email that arrived from a shady domain, and of opening a hole in the company's security system.

Because the article is long, I placed a TLDR section right here at the beginning for those who eat their lunch standing up because they don't have time for anything.

TLDR;

To answer the question in the title: most probably not, but yes, there is a slim chance. Without proper validations on the code side, there is a chance that an RPA automation becomes a target of spear-phishing.

If the attackers know the mailbox address(es) that the robot reads and some of the filters and actions it applies, and the company has no phishing validations in place, they can start a spear-phishing campaign against those mailboxes in the hope of a successful attack.

A simple fix for this vulnerability is to validate the domain of the sender's email address against a list of trusted domains. Other tips are described in the later sections of the article.

Vector of attack

I came up with the idea for this article while I was doing some routine security training. It occurred to me that our robots don't do this security training, so logically they are not aware of all the attack vectors that criminals use nowadays. (I like to imagine robots as company employees, by the way.)

Generally speaking, with many robots deployed across an entire company, there could well be a handful of robots that read all of the emails that arrive in the Inbox and open URLs or download attachments mindlessly. They might filter for some subjects, or some words in the body of the email, or some dates, but they don't check much more than that, and especially not the sender domain.

And if you, as an attacker, know the checks they do and the type of emails they are expecting, you can embed malicious software in an attachment or simply include a URL to a malicious website.

So it is clear that spear-phishing is the main threat to these robots.

What is spear-phishing?

This is a type of attack that shares a lot of similarities with normal phishing emails but has a higher success rate in comparison. It presents itself as a normal email, but contains hyperlinks or attachments with malicious intent. The hyperlinks might redirect to a malicious website, while the attachments might contain macros or other pieces of malicious code.

So far it seems that I'm describing a regular phishing attack, but the main difference is the specificity of the email itself. In these attacks the email targets just one person, or a small group of people. The attacker also spends much more time learning the routines of these people and what they expect to see in their day-to-day operations. They can even go as far as searching for small details of personal information that, when added in context to the email, increase the credibility of the whole attack. So, as you can imagine, these sorts of attacks are a real danger to a company, especially since our mailboxes are always super busy with emails coming and going.

For robots, the danger is not getting to know their personal lives, because they have no life outside the office and their machines. But the real danger is the attacker discovering the full details of which checks the robots do and how they do them, so they can prepare their emails with the best possible chance of success.

More on the topic: https://www.fortinet.com/resources/cyberglossary/spear-phishing

The damage

While the probability of the attack is small, the potential damage can be quite serious. As attacks tend to become more effective, they can cause more harm than just the exfiltration of data. Especially with attachments, these attacks can lead to unauthorized access to the system, and from there things can escalate quickly.

So, in the worst-case scenarios, we can assume:

Compromised Data: The attacker may gain unauthorized access to sensitive data within the RPA system or the connected systems it interacts with. This can include confidential business information, customer data, financial records, or intellectual property. The compromised data can be exploited for financial gain, industrial espionage, or used for further attacks.
Disrupted Operations: An RPA system is often responsible for automating critical business processes. If the attacker successfully compromises the system, they may disrupt or manipulate these processes, leading to operational chaos, errors, or delays. This can impact productivity, customer service, and business continuity.
Unauthorized Access to other systems: RPA robots often require access to various systems and applications to perform their tasks. If the attacker gains control of the robot, they may abuse this access to exploit vulnerabilities in connected systems, gain deeper access privileges, or pivot to other parts of the network. This can result in further security breaches or unauthorized actions within the organization's infrastructure.
Financial Loss: Successful spear-phishing attacks on RPA systems can have financial ramifications. The organization may face financial losses due to compromised funds, fraudulent transactions, or ransom demands from the attacker. Additionally, the costs associated with incident response, system restoration, legal actions, and reputation damage can be significant.
Reputational Damage: A successful attack on an RPA system can erode the organization's reputation, particularly if sensitive customer data is exposed or operations are disrupted. Trust among clients, partners, and stakeholders may be compromised, leading to potential loss of business and negative public perception.

Proof of concept

Insecure code

Here I prepared a small POC to demonstrate the effectiveness of these attacks against a simple RPA automation. This automation simulates a dispatcher that reads a mailbox and consists of:

reading all the emails on a mailbox, one by one;

The workflow first reads the domain list (in this case an Excel file) and then extracts the emails with an Office 365 connector (for easy manageability).

validating if the email contains a specific expression in the subject (just for credibility);
extracting all URLs from the email body and validating that there is only one URL (for even more credibility = more business rules);
opening a Firefox browser with the extracted URL;

open Firefox

and the rest of operations don't matter, because the attack already took effect... 😈
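The steps above can be sketched in Python as follows. This is only an illustration of the vulnerable logic, not the actual workflow from the POC: the `Email` class, `REQUIRED_SUBJECT`, and `select_url_unsafe` are hypothetical names, the subject rule is an assumed example, and the function returns the URL instead of opening a browser.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical names and values, chosen only for this illustration.
REQUIRED_SUBJECT = "invoice"  # assumed business rule on the subject
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class Email:
    sender: str
    subject: str
    body: str


def select_url_unsafe(email: Email) -> Optional[str]:
    """Return the URL the robot would open, or None if a business rule fails.

    Note what is missing: the sender address is never inspected.
    """
    if REQUIRED_SUBJECT not in email.subject.lower():
        return None  # subject rule failed
    urls = URL_PATTERN.findall(email.body)
    if len(urls) != 1:
        return None  # exactly one URL is expected
    return urls[0]  # the robot would now open this URL in the browser
```

Any email that satisfies the two business rules gets its URL selected, no matter where it came from.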

What happens?

Now I gave it two examples: one that does not follow the rules and ... our "special" email. And here are the results:

logs of execution

First, I gave it a random email from the university, and it failed because the subject was not compliant with the business rules (meaning they are working).

The second was the prepared email with the nefarious URL redirect, and it was processed. It passed all business validations and Firefox opened.

Here are the emails in question:

emails used for testing

And here is the email that I received when the robot opened the URL in Firefox:

exfiltrated data

This shows that our attack worked perfectly, and it gives away some information about the target.

Secure code

Now let's see an overview of the security code that I can add to block these types of emails.

validate inputs and domain extraction

First there is a validation on the input email strings. Then the extraction of the domain from the email addresses.
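As a rough Python equivalent of this first step, the sketch below validates an address string and extracts its domain. The function name `extract_domain` is hypothetical, and the sketch assumes bare addresses (no display names) and plain ASCII domains; the original workflow may apply different rules.

```python
import re

# Assumption: ASCII-only domains made of at least two dot-separated labels.
_DOMAIN_RE = re.compile(r"[a-z0-9-]+(\.[a-z0-9-]+)+")


def extract_domain(address: str) -> str:
    """Return the lower-cased domain of an email address, or raise ValueError."""
    if not isinstance(address, str) or not address.strip():
        raise ValueError("empty email address")
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.strip().rpartition("@")
    domain = domain.lower()
    if not sep or not local or not _DOMAIN_RE.fullmatch(domain):
        raise ValueError(f"not a valid email address: {address!r}")
    return domain
```

Raising an error on malformed input, instead of returning an empty string, makes it harder for a bad address to slip silently into the later checks.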

domain validation

Then comes the extraction of the minimum common domain, to ensure that both addresses (the "from" and the "sender") belong to the same domain.
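One possible reading of "minimum common domain" is the longest run of trailing labels that the two domains share. The sketch below implements that interpretation, which is my assumption rather than a description of the original workflow; `common_domain` is a hypothetical name.

```python
from typing import Optional


def common_domain(from_domain: str, sender_domain: str) -> Optional[str]:
    """Return the longest shared suffix of labels, or None if too short.

    A result of fewer than two labels (only the top-level domain in common,
    or nothing at all) is treated as "no common domain".
    """
    a = from_domain.lower().split(".")[::-1]
    b = sender_domain.lower().split(".")[::-1]
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    if len(shared) < 2:
        return None
    return ".".join(reversed(shared))
```

With this reading, `mail.uab.pt` and `uab.pt` share `uab.pt`, while `uab.pt` and `evil.pt` share nothing usable. Two unrelated domains under a suffix such as `co.uk` would still yield `co.uk`, so the result only becomes meaningful once it is checked against the trusted list.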

domain validation part 2

Lastly, the workflow verifies whether this common domain is on the supplied list of trusted domains. First an exact match is tested, then a match that allows for sub-domains. If either of the two matches is true, the domain is valid and trusted.
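The two-stage match described above could look like this in Python. It is a minimal sketch with a hypothetical function name, `is_trusted`. The important detail is that the sub-domain test matches on a leading dot, so `notuab.pt` is not mistaken for a sub-domain of `uab.pt`.

```python
from typing import Iterable


def is_trusted(domain: str, trusted_domains: Iterable[str]) -> bool:
    """Return True if domain is a trusted domain or a sub-domain of one."""
    domain = domain.strip().lower()
    trusted = {d.strip().lower() for d in trusted_domains if d.strip()}
    # First test: exact match.
    if domain in trusted:
        return True
    # Second test: sub-domain match, anchored on the dot before the suffix.
    return any(domain.endswith("." + t) for t in trusted)
```

A plain "ends with" or "contains" comparison without the dot would accept look-alike domains, which is exactly what this check is meant to reject.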

Here is the list of tests I ran; the check seems to work correctly across the different scenarios.

tests

I added this workflow before all the business validations and re-ran the previous emails.

Re-run tests with secure code

These are the logs of the run:

logs of execution with secure code - domain check failed

So neither email passed the domain test, and therefore neither was processed (the expected behavior).

logs of execution with secure code - domain check passed

After adding the domain "uab.pt" to the list, we can see that both emails passed the security validation, even the one coming from a sub-domain. So, this shows that the filtering is working.

But it also shows that we should only add to the list of trusted domains the ones that are really needed; this way we remove a lot of attack surface. There is no point in implementing this check if we open the filter wide afterwards.

The code is available on GitHub. As this is just a POC, it is normal that the code is not refactored for best performance.

Results

The malicious email in question has a canary token attached to the URL (a redirect URL). Once the URL is opened, the attacker is notified that it was opened and receives some details about the target that opened it.

This example does not show any major breach, but shows the exfiltration of some infrastructure details.

As you can see, even with vulnerable code, a lot of stars have to align for the attack to succeed (the email needs to arrive with the right subject and the right number of URLs, and in real scenarios more business validations/rules might be applied). So, the attacker already has to know a lot of information to succeed with an attack like this. But sometimes the information is given away by an inside job (mafia style), or these details are given away unintentionally in an RPA presentation.

More on Canary tokens

Think of canary tokens as little digital decoys that you strategically place around your system. These decoys look like totally normal stuff that people might want to click on or interact with, like files, URLs, or email addresses. But here's the catch: they're secretly designed to tip you off if someone unauthorized tries to mess with them.

When a canary token gets triggered, it quietly sends an alert to the system admin or the security team. It's like having a little bird watching over your system, warning you about potential breaches before things get messy. They can also capture some information about whoever triggered them, like the IP address.

So, as you can see, these tokens are used more on the defensive side, to add security to an existing system. And surely there are other tools better suited to these types of attacks. But as this is just a POC, I decided to use canary tokens for simplicity, as the focus here is not the attack itself but demonstrating the weak point of the automation.

More on this: https://www.fortinet.com/resources/cyberglossary/what-is-canary-in-cybersecurity and https://docs.canarytokens.org/guide/#what-are-canarytokens

Simple ways of fixing this possible vulnerability

The first piece of advice I can provide is to not give away the business and technical implementation details of your processes, as described above, to third parties outside of your organization. This does not mean that the process should not be documented. It just means that the information needs to be stored but not shared with outsiders. You should also check whether you give these details away in your public presentations and demonstrations. Unfortunately, this alone will not prevent an attack from happening; it will just make it more difficult for the attacker to learn the details needed to start a campaign.

The second and most important piece of advice, already mentioned earlier in the article, is to validate the domain of the sender's email address against a list of trusted domains. This list can be:

a simple asset of the type string (on the orchestrator) with the domains separated by commas or semi-colons;
a file (text, csv, xlsx) placed on a folder, accessible for the business users and the robots, for quick and easy updates on the list;
a database, with read permissions to the robot and write permissions to the users;
some other way that allows for updates on the list by the users and read of information by the robot.
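For the first option, a string asset with domains separated by commas or semicolons, the parsing can be very small. The sketch below uses a hypothetical function name, `parse_trusted_domains`, and assumes the asset value has already been fetched as a plain string.

```python
import re


def parse_trusted_domains(asset_value: str) -> set:
    """Split a comma- or semicolon-separated string into a set of domains."""
    parts = re.split(r"[,;]", asset_value)
    # Trim whitespace, normalize case, and drop empty entries.
    return {p.strip().lower() for p in parts if p.strip()}
```

Normalizing case and whitespace here keeps small typing slips by the business users from silently breaking the comparison later.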

The most important detail when building this list is to narrow it down to only the domains used by each process. Try to stick as much as possible to custom domains (ones that really belong to an existing company or entity), for example: acme.com or uipath.com or somerandombank.com. Adding domains available to regular users, like gmail.com and outlook.com, is a no-no. The reason is simple: any random person can easily create an account on these services and use it for an attack.

Of course, if you really need to accept all possible incoming emails, just make sure to clearly log all processed emails (the valid ones and the invalid ones) with the names of attachments it worked on, or URLs it extracted. And also try to add as many business rules as you can. For example, if you extract URLs, and you expect that all URLs are coming from the same website (a ticketing platform, for example) you can compare and validate if the domain of the website is the same as the one on the URL.
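The URL rule from the example above can be sketched with Python's standard `urllib.parse` module. The function name `url_matches_expected_host` and the sample host are hypothetical; the sketch assumes only `http` and `https` links are acceptable.

```python
from urllib.parse import urlsplit


def url_matches_expected_host(url: str, expected_host: str) -> bool:
    """Return True if the URL points at expected_host or one of its sub-domains."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    expected = expected_host.strip().lower()
    # Compare the parsed host, not the raw string, so tricks such as
    # "https://example.com@evil.test/" do not pass.
    return host == expected or host.endswith("." + expected)
```

Comparing the parsed host rather than searching the URL text for the expected name avoids matching the name when it only appears in the path, the query, or the user-info part of the URL.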

Is this solution effective against emails coming from "spoofed domains"?

Emails with spoofed domains are deceptive emails where the sender manipulates the email header information to make it appear as if the message is originating from a different domain than the actual source. In other words, the email's "From" field is forged to mislead the recipient into believing it comes from a trusted sender or organization.

This problem should be fixed at the organization level: implementing email authentication protocols like SPF, DKIM, and DMARC helps to detect and prevent spoofed emails from reaching recipients' inboxes. That way the "spoofed domain" problem is addressed for the whole company, and this applies to RPA as well.

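But if your company does not implement these protocols, this solution still offers some protection against this tactic, because it requires the "from" and the "sender" domains to agree and to be on the trusted list, while still accepting sub-domains of the main domain. Keep in mind that it only inspects the header fields, so a message that forges both fields with a trusted domain would still pass; the authentication protocols remain the proper fix.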

In overview

As demonstrated, even if there are automations out there with this flaw, the likelihood of an attack like this is small. The attacker needs to know the implementation details very thoroughly and also circumvent all the other security measures implemented in the organization. So, without an inside source, it gets really hard to carry out an attack like this one.

But if you give some credit to Murphy's law, you know it is better to be safe than sorry. So keeping our code secure and free of vulnerabilities, however small they are, is the best way to act.
