What is indexing?

Indexing is the processing of the pages that have been crawled. It creates the index that Google uses to return results when you search.

In fact, the robots do not store our pages as such; they analyze them and build an index of all the words they see and where each word appears. They also process the information in the TITLE tag and the ALT attributes of images. They do not, however, process everything a page contains: for example, they do not process the content of most Flash files or dynamic pages.
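To make the idea of "an index of all the words and their location" more concrete, here is a minimal sketch of an inverted index. It only illustrates the general technique, not how Google actually implements it; the function name build_index, the sample URLs, and the sample text are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the pages and word positions where it appears."""
    # index[word][url] -> list of positions of that word within the page
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        # Lowercase the text and split it into words (a simplifying assumption)
        words = re.findall(r"\w+", text.lower())
        for position, word in enumerate(words):
            index[word][url].append(position)
    return index


# Hypothetical pages, used only for illustration
pages = {
    "example.com/a": "Google indexes the words on a page",
    "example.com/b": "The index stores words and their location",
}

index = build_index(pages)
print(dict(index["words"]))
# {'example.com/a': [3], 'example.com/b': [3]}
```

Looking up a word in such a structure is a single dictionary access, which is why a search can be answered from the index without reading the original pages again.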
Do they read only HTML documents?

No, they also extract and index information from other file types: PDF, PS (Adobe PostScript), Lotus files (wk1, wk2, wk3, wk4, wk5, wki, wks, wku, lwp), Excel spreadsheets (xls), text documents (mw, doc, wri, rtf, ans, txt), PowerPoint presentations (ppt), Microsoft Works files (wks, wps, wdb), and swf files.

This is done to provide more results. In fact, we can run a search that returns only certain file types, for example:
filetype:doc "search text"
In most cases, even when we do not have the software needed to open these files, Google offers the option of viewing them as HTML or plain text.
Conversely, we can exclude certain file types from the results using a filter, for example:
-filetype:pdf "search text"
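
The two filters above can also be combined programmatically. The following is a small sketch that only builds the query string; build_query is a hypothetical helper and does not call any Google API.

```python
def build_query(text, include=None, exclude=()):
    """Compose a search string with optional filetype filters."""
    parts = []
    if include:
        # Restrict results to a single file type, e.g. filetype:doc
        parts.append(f"filetype:{include}")
    # Exclude each unwanted file type, e.g. -filetype:pdf
    parts.extend(f"-filetype:{ext}" for ext in exclude)
    # Quote the text so it is searched as an exact phrase
    parts.append(f'"{text}"')
    return " ".join(parts)


print(build_query("search text", include="doc"))
# filetype:doc "search text"
print(build_query("search text", exclude=["pdf"]))
# -filetype:pdf "search text"
```

Note that there is no space between the operator and the file extension.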