While surfing the net I came across this (in my opinion) very interesting article. The source is authoritative: it is the blog of the Infosec Institute, one of the leading cybersecurity schools in America.
As we know, Google is the quintessential search engine. It is the most used in the world, the most accurate, the most studied, and all professionals employed in SEO/SEM use Google as a reference.
In addition to being one of the most powerful databases, Google can be used (by malicious parties) to find sensitive files and web vulnerabilities, to identify operating systems, and even to find passwords, databases and the entire contents of mailboxes.
How is this possible? By using operators that Google provides and by taking advantage of the immense crawling capacity inherent in the Google search engine.
Google allows the use of three logical operators: AND, NOT and OR.
The AND operator is used to include multiple keywords in a single search and can be replaced by a single space " ", although results may differ slightly between the two forms.
The NOT operator is used to eliminate certain keywords from a search result. It is equivalent to "-", e.g., one can search for "email service -marketing".
The OR operator is used to include either one keyword or another in the result of a query; pages that contain both also qualify. It is equivalent to "|", e.g., "reverse | engineering".
In addition to these three operators Google also allows the use of: ~, +, *, ""
Tilde "~" : placed before a keyword, it extends the result of a query to that keyword's synonyms and similar words. Google has since retired this operator.
"+" character : Google tends to ignore punctuation and eliminate words such as "we," "the," "and", "of". Putting "+" directly before a word requires Google to include that word in the search, e.g., "safety is +never complete". Google has since retired this operator in favor of putting the word in quotation marks.
Character "" : is used to search for the exact phrase inserted between the quotation marks.
Character "*" : also called the "wildcard" character, is used to stand in for words that you do not want to specify precisely.
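To make the basic operators concrete, here is a minimal Python sketch that assembles a search string from them. The function name build_query and its parameters are hypothetical, chosen only for illustration; the sketch just concatenates text and does not contact Google.

```python
def build_query(required=(), either=(), excluded=(), phrase=None):
    """Compose a search string from the basic operators described above.

    All names here are illustrative assumptions, not part of any Google API.
    """
    parts = list(required)                    # a space acts as an implicit AND
    if either:
        parts.append(" OR ".join(either))     # OR: either keyword may match
    if phrase:
        parts.append(f'"{phrase}"')           # quotation marks: exact phrase
    parts.extend(f"-{word}" for word in excluded)  # minus sign: NOT
    return " ".join(parts)


print(build_query(required=["email", "service"], excluded=["marketing"]))
# email service -marketing
```

Keeping each operator in its own argument makes it harder to produce a malformed query, for example a "-" separated from its keyword by a space.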
The Define operator (define:keyword) returns the definition of a word, taken from a source Google considers reliable.
Using Filetype you can search for files with a specific extension, restricting the search to a particular file type. For example, "backup filetype:sql"
Ext works much like Filetype: it restricts results to files with the given extension, e.g., "backup ext:sql". In practice Google appears to treat the two as aliases, so neither performs a more thorough search than the other.
Intitle:keyword
Search for a word found within the title of web pages.
Inurl:keyword
Search for a word present within the URL of web pages.
Intext:keyword / Allintext:keyword1 keyword2 keyword3 ...
Search for a word/words present within the body of the web page.
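The directives above all share the form operator:value, written with no space around the colon. The following Python sketch shows one way to compose them and turn the result into a search URL. The helper names dork and search_url are hypothetical; the URL is the ordinary Google search address with the query passed in the q parameter.

```python
from urllib.parse import urlencode


def dork(keywords="", **operators):
    """Join free keywords with operator:value pairs such as filetype:sql.

    `dork` is an illustrative helper name, not an official tool.
    """
    terms = [keywords] if keywords else []
    # No space is allowed between the operator, the colon and the value.
    terms.extend(f"{name}:{value}" for name, value in operators.items())
    return " ".join(terms)


def search_url(query):
    """Percent-encode the query and attach it to the search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = dork("backup", filetype="sql")
print(query)              # backup filetype:sql
print(search_url(query))  # https://www.google.com/search?q=backup+filetype%3Asql
```

Encoding the query with urlencode keeps characters such as the colon and the space from breaking the URL.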
So far there is nothing wrong, but we will see that by combining different operators and different keywords the results often exceed our expectations, especially when we are looking for vulnerabilities or some "private" data. This is conventionally called Google Hacking.
For example, we could use Google to find files containing usernames simply by typing "allintext:username filetype:log", or find emails by typing "allintext:email OR mail +*gmail.com filetype:txt", and so on.
Google can therefore be our ally but also our enemy. Its ability to scan the web can create problems for those who do not take the right precautions.
The first precautions are certainly to set the right permissions (CHMOD) on sensitive directories and to limit or eliminate access to uploaded backups.
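As a rough way to check the permissions point on your own server, the Python sketch below walks a directory and reports files that carry a sensitive extension and are readable by every user on the system. The function name exposed_files and the extension list are assumptions made for illustration, and the check only looks at Unix permission bits, not at the web server's configuration.

```python
import os
import stat

# Hypothetical list of extensions that often hold dumps, logs or backups.
RISKY_EXTENSIONS = (".sql", ".log", ".bak")


def exposed_files(root, extensions=RISKY_EXTENSIONS):
    """Yield files under `root` with a risky extension and the
    'readable by others' permission bit set."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(folder, name)
            try:
                mode = os.stat(path).st_mode
            except OSError:
                continue  # skip broken links and unreadable entries
            if mode & stat.S_IROTH:  # the "other" read bit of CHMOD
                yield path
```

Calling list(exposed_files("/path/to/webroot")) on a directory of your own returns the candidates; a file found this way can be tightened with os.chmod(path, 0o600) or moved out of the web root altogether.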
A robots.txt file can also help protect the privacy of your data: by properly configuring it, you can ask Google and other well-behaved search engines not to index your website, files or directories.
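Before relying on a robots.txt file it is worth checking that its rules cover the paths you care about. The sketch below does this with Python's standard urllib.robotparser module. The rules in ROBOTS_TXT, the directory names and the helper name is_crawlable are made-up examples, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Example rules (an assumption for illustration): keep all crawlers
# out of two hypothetical directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /backups/
Disallow: /logs/
"""


def is_crawlable(robots_txt, path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(is_crawlable(ROBOTS_TXT, "/backups/site.sql"))  # False
print(is_crawlable(ROBOTS_TXT, "/index.html"))        # True
```

Keep in mind that robots.txt is itself publicly readable and is only honored by crawlers that choose to respect it, so it complements access restrictions rather than replacing them.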