Analyzing 136,000 New Domains with COVID-19 Themes

New domains related to COVID-19 are proliferating rapidly. Although many of the new domains are legitimate websites, such as charity pages set up to support response efforts, the sheer volume of new content makes it challenging for security professionals and end-users alike to distinguish which ones are legitimate, questionable, or outright malicious.

To gain a better understanding of this newly-created content, SecureLATAM ATO researchers have compiled and analyzed a list of over 136,000 hostnames and fully qualified domain names with COVID-19 or coronavirus themes from a variety of open-source feeds.

We then parsed and deduplicated the data and enriched it with HTTP data, additional DNS analysis, and WHOIS data. SecureLATAM ATO researchers collected this information both manually and with proprietary systems that automate data collection from those open sources. Based on this analysis, we identified how many domains have been flagged as malicious in public threat intelligence feeds, which registrars have registered the most COVID-19 themed domains over the last four months, and which keyword trends appear among these hostnames.
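The parsing and deduplication step can be illustrated with a minimal sketch. This is not the tooling used for the analysis; the function names and sample hostnames are hypothetical, and the normalization rules (lowercasing, trimming whitespace, dropping a trailing root dot) are assumptions about what a reasonable cleanup pass might do.

```python
def normalize_hostname(raw):
    """Lowercase, trim whitespace, and drop a trailing root dot."""
    return raw.strip().lower().rstrip(".")


def dedupe_hostnames(raw_hostnames):
    """Return unique normalized hostnames, preserving first-seen order."""
    seen = set()
    unique = []
    for raw in raw_hostnames:
        host = normalize_hostname(raw)
        # Skip empty lines and anything already recorded.
        if host and host not in seen:
            seen.add(host)
            unique.append(host)
    return unique


# Hypothetical feed entries, not taken from the dataset.
sample = [
    "Covid-Example.com",
    "covid-example.com.",
    " covid-example.com ",
    "test.corona-example.org",
]
print(dedupe_hostnames(sample))
```

Merging several open-source feeds tends to produce the same hostname in different casings or with a trailing dot, so normalizing before deduplicating avoids counting one hostname several times.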

More than 136,000 new COVID-19 themed domains were observed between December 1 and March 27.

For the purposes of this analysis, we examined 130,138 hostnames related to COVID-19, including 68,965 subdomains and 61,173 fully qualified domains. Although our raw dataset included 136,886 domains, 6,748 of them failed WHOIS analysis and were excluded because the WHOIS servers for their top-level domains were broken or unsupported.

The graph above shows new domain registrations by day. The first small spike on the graph, on February 11, reflects the date the World Health Organization named the virus COVID-19 (829 new domain registrations). The graph also reflects an increase in registrations starting around the last few days of February, a time when cases in Europe started to accelerate and newspapers began reporting initial infections in new countries around the world. The chart’s most dramatic spike in domain registrations was on March 12, the day before President Trump declared a national emergency in the United States (from 1,715 registrations on March 11 to 3,305 on March 12). The New York Times has summarized the progression of events within this timeline.
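A registrations-by-day view like the one described above can be sketched as follows, assuming each domain has a WHOIS creation date available. The function names and the sample dates are hypothetical and purely illustrative; they are not figures from the dataset.

```python
from collections import Counter
from datetime import date


def registrations_per_day(creation_dates):
    """Count how many domains were created on each calendar day."""
    return Counter(creation_dates)


def largest_daily_jump(daily_counts):
    """Return (day, increase) for the biggest rise between consecutive observed days.

    Days with no registrations at all are simply absent from the counts,
    so "consecutive" here means consecutive among the days that have data.
    """
    days = sorted(daily_counts)
    best_day, best_increase = None, 0
    for previous, current in zip(days, days[1:]):
        increase = daily_counts[current] - daily_counts[previous]
        if increase > best_increase:
            best_day, best_increase = current, increase
    return best_day, best_increase


# Illustrative creation dates only.
sample_dates = [
    date(2020, 3, 10),
    date(2020, 3, 11), date(2020, 3, 11),
    date(2020, 3, 12), date(2020, 3, 12), date(2020, 3, 12), date(2020, 3, 12),
]
counts = registrations_per_day(sample_dates)
print(largest_daily_jump(counts))
```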

Some registrars are more popular than others—though several providers have promised to crack down on abuse.

The chart below shows the most popular domain registrars used by the COVID-19 themed sites in our dataset.

Some hosting providers and domain registrars have started to crack down on coronavirus-themed abuse, an unprecedented step. Recently, GoDaddy, NameCheap, and Tucows, three of the five largest registrars of COVID-19 themed domains within our dataset, made statements about how they are trying to hinder abuse by blocking registrations that contain certain keywords or by actively taking down fraudulent sites. We found this interesting in part because providers have previously resisted this type of action, arguing that it would affect free speech.

This commitment from these providers may help curtail new COVID-19 themed domain registrations, and additional registrars may soon follow suit. However, the tremendous volume of recently-created domains may pose an obstacle as providers investigate potential fraud, which can be a highly manual process. In addition, we speculate that blocking specific keywords will cause bad actors to look for creative ways around these restrictions rather than abandoning their efforts. 

Many of the domains have active web content.

The majority of the domains we analyzed are accepting HTTP and HTTPS requests and have active web content. In other words, typing the domain name into your browser will direct you to a live website. That distinction matters because it means there’s activity happening; someone has taken the time to upload content.

Some of the domains merely display basic content to show that they have been purchased, indicating that they have been parked at the registrar while the owner waits to either use the domain for their own purposes or sell it. Domain scalping may account for some of these purchases; for example, someone might purchase domains related to COVID-19 cures or vaccines with the hope of eventually selling them to a pharmaceutical company.

The most popular keywords we observed tie back to coronavirus response efforts. 

To understand keyword trends across the domains, we examined a combination of singular keywords (COVID, vaccine, donation) and keywords that represent a set of related tags (mask, N95).
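The difference between a singular keyword and a keyword that stands for a set of related tags can be sketched like this. The group definitions and sample hostnames below are hypothetical, and plain substring matching is an assumption; the matching rules actually used for the analysis are not described here.

```python
# Hypothetical groups: each label maps to one or more tags that count toward it.
KEYWORD_GROUPS = {
    "covid": ["covid"],
    "vaccine": ["vaccine"],
    "donation": ["donation"],
    "mask": ["mask", "n95"],
}


def count_keyword_groups(hostnames, groups=KEYWORD_GROUPS):
    """Count hostnames per group; a hostname counts once per group it matches."""
    counts = {label: 0 for label in groups}
    for host in hostnames:
        lowered = host.lower()
        for label, tags in groups.items():
            if any(tag in lowered for tag in tags):
                counts[label] += 1
    return counts


# Illustrative hostnames only.
sample_hosts = ["covid-masks.example", "n95-supply.example", "covidvaccine.example"]
print(count_keyword_groups(sample_hosts))
```

Substring matching is simple but can overcount (for example, "mask" also matches "damask"), so a stricter tokenizer may be preferable for short keywords.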

A very large number of the domains (28,282) contain the word “virus,” which is unsurprising. “Medical” terms like nurse and doctor also make sense; the medical content could be legitimate, but may also include scams such as ecommerce sites selling fraudulent protective equipment or phishing lures that promise medical information. (We were surprised to find only 15 domains related to toilet paper.)

Given the confusion and uncertainty surrounding testing, it’s also unsurprising to see that “test” ranks as the third most popular keyword. Our researchers noted a small spike in domains using this keyword after President Trump gave a speech promising additional testing.

Note that “.gov website” refers to the 220 new domains in our dataset that have .gov TLDs, indicating that those sites are likely legitimate government websites.

Some TLDs may provide benefits to threat actors.

The vast majority of the domains we identified use the .com TLD. However, we noticed many of the newer generic top-level domains (gTLDs), which are popular with criminals because they can help malicious links seem more credible to users. For example, users may interpret gTLDs such as .shop and .store as indicators that they are looking at legitimate ecommerce providers.
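A TLD breakdown of this kind can be approximated with a short sketch. It takes the last dot-separated label of each hostname as the TLD, which is an assumption that ignores multi-label public suffixes such as co.uk; the function names and sample hostnames are hypothetical.

```python
from collections import Counter


def tld_of(hostname):
    """Return the last dot-separated label of a hostname, lowercased."""
    cleaned = hostname.strip().lower().rstrip(".")
    return cleaned.rsplit(".", 1)[-1]


def count_tlds(hostnames):
    """Tally hostnames by their final label."""
    return Counter(tld_of(host) for host in hostnames)


# Illustrative hostnames only.
sample_hosts = [
    "covid-relief.com",
    "corona-masks.shop",
    "covid-supplies.store",
    "Covid-News.COM.",
]
for tld, total in count_tlds(sample_hosts).most_common():
    print(f".{tld}: {total}")
```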

At the time of analysis, the majority of the domains in our dataset were hosted in the United States.

We found servers hosting COVID-19 themed content all around the world, with the majority based in the United States.

Within the dataset, we observed numerous examples of scams and phishing pages using domains related to COVID-19.

As we described in our recent post about common COVID-19 scams, threat actors are finding many ways to take advantage of people’s emotions about the global pandemic. As expected, we observed numerous phishing and scam domains when we examined the domains manually.

In the example below, SpyCloud researchers identified a phishing domain that a threat actor has set up to mimic a Chase Bank login. Most likely, the threat actor was sending phishing messages “from” Chase with some form of messaging about the bank’s COVID-19 response, making it seem plausible to users that their bank may have set up a dedicated page related to the virus. (Note that SpyCloud reported this domain to the hosting provider and it has since been removed.)

Relatively few of the domains we analyzed were included in lists of malicious feeds. 

We expected to accelerate our efforts to find malicious domains within our dataset by drawing on public threat intelligence feeds. What we found instead surprised us. 

When we ran all 136k hostnames in our dataset through 12 different community feeds, we were alarmed by how few of the domains were identified as malicious. Even the Google Safe Browsing API, which draws on a robust dataset because Google is one of the largest email providers in the world, flagged only 195 domains. Of the domains that Google Safe Browsing did identify, 84.6 percent were flagged for phishing.
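Checking a hostname list against several feeds comes down to a set-membership test per feed. The sketch below assumes each feed has already been downloaded as a plain list of hostnames; the feed names, entries, and function names are hypothetical, and it does not call the Google Safe Browsing API or any other real service.

```python
def flagged_by_feeds(hostnames, feeds):
    """Map each flagged hostname to the sorted names of the feeds that list it."""
    hits = {}
    for feed_name, entries in feeds.items():
        listed = {entry.strip().lower() for entry in entries}
        for host in hostnames:
            if host.strip().lower() in listed:
                hits.setdefault(host, []).append(feed_name)
    return {host: sorted(names) for host, names in hits.items()}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Hypothetical feeds and hostnames, for illustration only.
feeds = {
    "feed_a": ["covid-login.example"],
    "feed_b": ["COVID-login.example", "unrelated.example"],
}
hostnames = [
    "covid-login.example",
    "covid-charity.example",
    "corona-news.example",
    "covid-map.example",
]
hits = flagged_by_feeds(hostnames, feeds)
print(hits)
print(percent(len(hits), len(hostnames)), "percent flagged")
```

The same percentage helper shows that the 84.6 percent figure is consistent with 165 of the 195 flagged domains. That count is inferred from the percentage and is not stated in the analysis itself.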

We believe that far more of the COVID-19 domains in our dataset are malicious than are reflected in the community feeds cited above. One potential reason is that the feeds we used focus on threat intelligence specific to phishing and malware, not necessarily scam sites. In addition, these feeds are sometimes automatically ingested into security products, which raises the potential impact of false positives: a wrongly listed domain could cause service disruptions in corporate and private networks. That risk likely makes feed maintainers cautious about what they list.

Conclusion: Here’s how the security research community can help.

The huge volume of new COVID-19 themed content represents a security challenge for users, corporate IT teams, and security professionals alike. Distinguishing between legitimate and malicious content has always been challenging. Criminals know that everyone is trying their best to stay on top of the latest news related to the COVID-19 crisis and will continue to exploit fear for profit.

As researchers, we are encouraged by the contributions we have already seen from the security community. Here are a few of the ways you can get involved, either to contribute research or to keep your own users safer:

Educate users about phishing and internet safety

It’s important to educate your users about identifying phishing attempts and avoiding risky actions like downloading files from unknown senders. The National Cybersecurity Alliance provides a thorough list of resources. Users can even play a role in helping combat coronavirus-related scams by reporting suspicious messages to email providers and corporate IT. Though flagging a phishing message within your inbox may not feel like a big deal, that action helps providers identify malicious content and flag it for other users.

Identify IoCs within your own environment

Security professionals can use the COVID-19 Cyberthreat Coalition’s vetted list to look for indicators in their networks that may have COVID-19 themes. Although the list is currently small, it offers one of the most comprehensive lists of confirmed threats. Researchers from many different companies manually vet every item on this list, making false positives unlikely. 

Contribute to community feeds

As a security community, it’s important that we contribute back to public feeds. Identifying domains that are being used for phishing, malware, and other scams will require community effort because of the enormous quantities of content that are coming online. Manual verification is essential, particularly because a lot of threat intelligence feeds are tied into automated equipment like firewalls and other security products. When false positives are introduced, they can interfere with the performance of networking devices and other security products. 
