Windows Search, formerly known as Windows Desktop Search (WDS) on Windows XP and Windows Server 2003, is an indexed desktop search platform created by Microsoft for Microsoft Windows.
Overview
Windows Search collectively refers to the indexed search on Windows Vista and later versions of Windows (also referred to as Instant Search) as well as Windows Desktop Search, a standalone add-on for Windows 2000, Windows XP and Windows Server 2003 made available as freeware. All incarnations of Windows Search share a common architecture and indexing technology and use a compatible application programming interface (API).
Windows Search is the successor of the Indexing Service, a remnant of the Object File System feature of the Cairo project which never materialized. Windows Search uses a different architecture.
Windows Search builds a full-text index of files on a computer. (An add-in for 32-bit Windows XP, Windows Server 2003 and Windows Vista allows network shares to be added to the index.) The time required for the initial creation of this index depends on the amount and type of data to be indexed and can run to several hours, but this is a one-time event. Once a file's contents have been added to the index, Windows Search can return results from the index more rapidly than it could by searching through all the files on the computer. Searches are performed not only on file names, but also on the contents of the file (provided a proper handler for the file type is installed) as well as the keywords, comments and all other forms of metadata that Windows Search recognizes. For instance, searching the computer for "The Beatles" returns a list of music files on the computer which have "The Beatles" in their song titles, artists or album names, as well as any e-mails and documents that include the phrase "The Beatles" in their titles or contents.
Windows Search features incremental search (also known as "search as you type"). It begins searching as soon as characters are entered in the search box, and keeps refining and filtering the search results as more characters are typed in. As a result, the required files are often found before the full search text has been entered.
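The combination of a full-text index and search-as-you-type described above can be illustrated with a toy inverted index. This is a minimal sketch in Python, not the data structure Windows Search actually uses: the class name TinyIndex, the tokenization rule and the single-term prefix lookup are all assumptions made for illustration.

```python
import re
from bisect import bisect_left
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: token -> set of document ids (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self._sorted_tokens = []

    def add(self, doc_id, text):
        # Tokenize on word characters and record which document holds each token.
        for token in re.findall(r"\w+", text.lower()):
            self.postings[token].add(doc_id)
        # Keep a sorted token list so prefixes can be located by binary search.
        self._sorted_tokens = sorted(self.postings)

    def search_prefix(self, prefix):
        """Return ids of documents containing any token that starts with prefix."""
        prefix = prefix.lower()
        start = bisect_left(self._sorted_tokens, prefix)
        hits = set()
        for token in self._sorted_tokens[start:]:
            if not token.startswith(prefix):
                break
            hits |= self.postings[token]
        return hits


def search_as_you_type(index, typed):
    """Return (partial query, sorted results) for each keystroke of `typed`."""
    return [
        (typed[:i], sorted(index.search_prefix(typed[:i])))
        for i in range(1, len(typed) + 1)
    ]


index = TinyIndex()
index.add("help.mp3", "The Beatles - Help")
index.add("notes.txt", "beach holiday notes")
index.add("report.doc", "Quarterly report")

for partial, results in search_as_you_type(index, "beat"):
    print(partial, results)
```

Each additional character narrows the candidate set without rescanning the documents, which is the property that makes incremental search cheap once the index exists.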
Windows Search supports IFilters, components that enable search programs to scan files for their contents and metadata. Once an appropriate IFilter has been installed for a particular file format, the IFilter is used to extract the text from files which were saved in that format.
Windows Search by default includes IFilters for common filetypes, including Word documents, Excel spreadsheets, PowerPoint presentations, HTML files, text files, MP3 and WMA music files, WMV, ASF and AVI video files and JPEG, BMP and PNG images.
Windows Search uses property handlers to handle metadata from file formats. A property handler needs a property description and a schema for the property for Windows Search to index the metadata. Protocol handlers are used for indexing specific data stores. For example, files are accessed using the File System Protocol Handler, Microsoft Office Outlook data stores using the Outlook Protocol Handler and the Internet Explorer cache using the IE History/Cache Protocol Handler.
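The idea of per-format handlers that turn a file into indexable text can be sketched as a lookup table keyed by file extension. The following Python sketch is only an analogy: real IFilters are COM components, and the names FILTERS, filter_html, filter_plain_text and extract_text are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import PurePath


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def filter_plain_text(raw):
    return raw.decode("utf-8", errors="replace")


def filter_html(raw):
    parser = _TextOnly()
    parser.feed(raw.decode("utf-8", errors="replace"))
    parser.close()
    return " ".join(chunk.strip() for chunk in parser.chunks if chunk.strip())


# Hypothetical registry: file extension -> function that extracts text.
FILTERS = {".txt": filter_plain_text, ".html": filter_html}


def extract_text(file_name, raw):
    """Return extracted text, or None when no handler is registered."""
    handler = FILTERS.get(PurePath(file_name).suffix.lower())
    return handler(raw) if handler else None


print(extract_text("page.html", b"<p>Hello <b>world</b></p>"))
```

A file type with no registered handler yields no content, which mirrors the point above that contents are searchable only when a proper handler for the file type is installed.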
Architecture
Windows Search is implemented as a Windows Service. The search service implements the Windows Search configuration and query APIs and also controls all indexing and query components. The most important component of Windows Search is the Indexer, which crawls the file system on initial setup, and then listens for file system notifications to pick up changed files in order to create and maintain the index of data. It achieves this using three processes:
- SearchIndexer.exe, which hosts the indexes and the list of URIs that require indexing, as well as exposes the external configuration and query APIs that other applications use to leverage the Windows Search features.
- SearchProtocolHost.exe, which hosts the protocol handlers. It runs with the least permission required for the protocol handler. For example, when accessing the file system, it runs with the credentials of the system account, but when accessing network shares, it runs with the credentials of the user.
- SearchFilterHost.exe, which hosts the IFilters and property handlers to extract metadata and textual content. It is a low integrity process, which means that it does not have permission to change system settings. Even if a file with malicious content manages to take over the process, it will not be able to change any system settings.
The search service consists of several components, including the Gatherer, the Merger, the Backoff Controller, and the Query Processor, among others. The Gatherer retrieves the list of URIs that need to be crawled and invokes the proper protocol handler to access the store that hosts the URI, and then the proper property handler (to extract metadata) and IFilter (to extract the document text). Different indices are created during different runs; it is the job of the Merger to periodically merge the indices. While indexing, the indices are generally maintained in memory and then flushed to disk after a merge to reduce disk I/O. The metadata is stored in the property store, which is a database maintained by the ESE database engine. The text is tokenized and the tokens are stored in a custom database built using inverted indices. Apart from the indices and property store, another persistent data structure is maintained: the Gather Queue. The Gather Queue maintains a prioritized queue of URIs that need indexing. The Backoff Controller mentioned above monitors the available system resources, and controls the rate at which the indexer runs. It has three states:
- Running: In this state, the indexer runs without any restrictions. The indexer runs in this state only when there is no contention for resources.
- Throttled: In this state, the crawling of URIs and extraction of text and metadata is deliberately throttled, so that the number of operations per minute is kept under tight control. The indexer is in this state when there is contention for resources, for example, when other applications are running. By throttling the operations, it is ensured that the other operations are not starved of resources they might need.
- Backed off: In this state, no indexing is done. Only the Gather Queues are kept active so that items do not go unindexed. This state is activated on extreme resource shortage (less than 5 MB of RAM or 200 MB of disk space), or if indexing is configured to be disabled when the computer is on battery power, or if the indexer is manually paused by the user.
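The three states above amount to a small decision procedure over the current system conditions. The sketch below expresses that procedure in Python under stated assumptions: the thresholds (5 MB of RAM, 200 MB of disk space) come from the description above, while the SystemSnapshot fields, the choose_state function and the order in which conditions are checked are hypothetical and do not describe the actual Backoff Controller.

```python
from dataclasses import dataclass
from enum import Enum


class IndexerState(Enum):
    RUNNING = "running"
    THROTTLED = "throttled"
    BACKED_OFF = "backed off"


# Thresholds taken from the description above.
MIN_FREE_RAM_MB = 5
MIN_FREE_DISK_MB = 200


@dataclass
class SystemSnapshot:
    """Hypothetical view of the conditions the controller reacts to."""
    free_ram_mb: float
    free_disk_mb: float
    resource_contention: bool = False   # e.g. other applications are busy
    on_battery: bool = False
    pause_on_battery: bool = False      # user setting: no indexing on battery
    paused_by_user: bool = False


def choose_state(snapshot):
    """Pick the indexer state for one snapshot of system conditions."""
    if snapshot.paused_by_user:
        return IndexerState.BACKED_OFF
    if snapshot.on_battery and snapshot.pause_on_battery:
        return IndexerState.BACKED_OFF
    if (snapshot.free_ram_mb < MIN_FREE_RAM_MB
            or snapshot.free_disk_mb < MIN_FREE_DISK_MB):
        return IndexerState.BACKED_OFF
    if snapshot.resource_contention:
        return IndexerState.THROTTLED
    return IndexerState.RUNNING


print(choose_state(SystemSnapshot(free_ram_mb=2048, free_disk_mb=50_000)))
print(choose_state(SystemSnapshot(2048, 50_000, resource_contention=True)))
print(choose_state(SystemSnapshot(3, 50_000)))
```

Checking the backed-off conditions first reflects that they suspend indexing entirely, whereas contention alone only slows it down.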
Advanced Query Syntax
Windows Search queries are specified in Advanced Query Syntax (AQS), which supports not only simple text searches but advanced property-based query operations as well. AQS defines keywords that refine the search query, such as boolean operations on search terms (AND, OR, NOT) and filters based on file metadata or file type. It can also be used to limit results to specific information stores such as regular files, the offline files cache, or e-mail stores. File type specific operators are available as well. WDS also supports wildcard prefix matching searches. It also includes several SQL-like operators like GROUP BY. AQS is locale dependent and uses different keywords in international versions of Windows 7.
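As a rough illustration of how such queries are composed, the Python helpers below assemble AQS-style strings from terms, property filters and boolean operators. This is a sketch only: the helper names are hypothetical, the keywords kind and author are used as English-locale examples, and values containing double quotes are not handled.

```python
def aqs_term(value, keyword=None):
    """Format one term, quoting phrases and optionally adding a property keyword."""
    text = f'"{value}"' if " " in value else value
    return f"{keyword}:{text}" if keyword else text


def aqs_all(*terms):
    """Join terms so that every one of them must match."""
    return " AND ".join(terms)


def aqs_any(*terms):
    """Join terms so that at least one must match (parenthesized for nesting)."""
    return "(" + " OR ".join(terms) + ")"


def aqs_not(term):
    """Exclude items matching the term."""
    return f"NOT {term}"


query = aqs_all(
    aqs_term("The Beatles"),
    aqs_term("music", keyword="kind"),
    aqs_not(aqs_term("dave", keyword="author")),
)
print(query)

# A trailing asterisk requests prefix matching on a term.
print(aqs_any(aqs_term("beat*"), aqs_term("help")))
```

Because AQS keywords are locale dependent, a builder like this would need a per-locale keyword table before its output could be used on a non-English system.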
Programmability
Users can access the Windows Search index programmatically using managed as well as native code. Native code connects to the index catalog by using a Data Source Object retrieved from the Indexing Service OLE DB provider. Managed code uses the MSIDXS ADO.NET provider. One can query a catalog on a remote machine by specifying a UNC path. Programmers specify the criteria for searches using SQL-like syntax. The SQL query can either be created by hand, or by using an implementation of the ISearchQueryHelper interface. Windows Search provides implementations of the interface that convert AQS or NQS queries to their SQL counterparts.
The OLE DB/SQL API implements the functionality for searching and querying across the indices and property stores. Queries are expressed in a variant of SQL (regular SQL with certain restrictions) and results are returned as OLE DB Rowsets. Whenever a query executes, the parts of the index it used are temporarily cached, so that further searches filtering the result set need not access the disk again, which improves performance. Windows Search stores its index in an Extensible Storage Engine file named Windows.edb that exists, by default, in the \ProgramData\Microsoft\Search\Data\Applications\Windows\ folder at the root of the system drive in Windows Vista or in later versions of Windows. (The corresponding location in Windows XP is \All Users\Application Data\Microsoft\Search\Data\Applications\Windows\ inside the Documents and Settings folder.)
The index store, called SystemIndex, contains all retrievable Windows IPropertyStore values for indexed items. The SystemIndex folder also contains SystemIndex.*.Crwl and SystemIndex.*.gthr files. The names and locations of documents in the system are exposed as a table with the column names System.ItemName and System.ItemURL respectively. A SQL query can directly reference these tables and index catalogues and use the MSIDXS provider to run queries against them. The search index can also be used via OLE DB, using the CollatorDSO provider. However, the OLE DB provider is read-only, supporting only SELECT and GROUP ON SQL statements.
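To make the shape of such a query concrete, the Python sketch below assembles a SELECT statement against the SystemIndex table. It only builds the query text; executing it requires Windows and an OLE DB or ADO client, which is not shown. The function name build_search_sql and its parameters are hypothetical, and the column name is written in its canonical casing, System.ItemUrl.

```python
def build_search_sql(phrase, columns=("System.ItemName", "System.ItemUrl"),
                     scope=None, limit=None):
    """Build a Windows Search SQL query string for a full-text phrase search."""
    # Drop double quotes and double single quotes so the phrase cannot
    # break out of the CONTAINS('"..."') literal.
    safe_phrase = phrase.replace('"', "").replace("'", "''")
    top = f"TOP {int(limit)} " if limit is not None else ""
    sql = (f"SELECT {top}{', '.join(columns)} FROM SystemIndex "
           f"WHERE CONTAINS('\"{safe_phrase}\"')")
    if scope is not None:
        # SCOPE restricts results to items under the given URL.
        sql += " AND SCOPE='{}'".format(scope.replace("'", "''"))
    return sql


# An ADO client would typically open the CollatorDSO provider with a
# connection string such as:
#   Provider=Search.CollatorDSO;Extended Properties='Application=Windows';
print(build_search_sql("The Beatles"))
print(build_search_sql("The Beatles", scope="file:C:/Users", limit=10))
```

Since the provider is read-only, a helper of this kind never needs to produce anything other than SELECT statements.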
Windows Search also registers a search-ms application protocol, which can be used to represent searches as URIs. The search parameters and filters are encoded in the URI using AQS or its natural language counterpart, NQS. When Explorer invokes the URI, Windows Search (which is the default registered handler for the protocol) launches the Search Explorer with the results of the search. In Windows Vista SP1 or later, third-party handlers can also register themselves as the application protocol handler, so that searches can be performed using any search engine which the user has set as default, and not just Windows Search.
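A search-ms URI can be assembled with ordinary URL encoding. The Python sketch below is a minimal illustration: the function name build_search_ms_uri is hypothetical, and only three of the protocol's parameters are used (query, crumb with a location: value, and displayname).

```python
from urllib.parse import quote


def build_search_ms_uri(query, location=None, display_name=None):
    """Encode an AQS query, and optionally a folder, as a search-ms URI."""
    # safe="" percent-encodes every reserved character, including ':' and '\'.
    parts = [f"query={quote(query, safe='')}"]
    if location is not None:
        parts.append(f"crumb=location:{quote(location, safe='')}")
    if display_name is not None:
        parts.append(f"displayname={quote(display_name, safe='')}")
    return "search-ms:" + "&".join(parts)


uri = build_search_ms_uri("kind:music beatles", location="C:\\Users")
print(uri)
```

On a Windows machine, passing such a URI to the shell (for example through Python's os.startfile) would hand it to whichever handler is registered for the protocol; that step is left out here because it only works on Windows.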
The Windows Search service provides the Notifications API component to allow applications to "push" changed items that need indexing to the Windows Search indexer. Applications use the component to supply the URIs of the items that need to be indexed, and the URIs are written to the Gather Queue, where they are read off by the indexer. Microsoft Office Outlook 2007 and Microsoft Office OneNote 2007 use this ability to index the items they manage, and use Windows Search queries to provide their in-application search features. The internal USN Journal Notifier component of Windows Search also uses the Notifications API, monitoring the Change Journal in an NTFS volume to keep track of files that have changed on the volume. If the file is in a location indexed by Windows Search and does not have the FANCI (File Attribute Not Content Indexed) attribute set, the Windows Search service is notified of its path via the Notifications API.
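The flow described above, in which changed items are pushed into a prioritized queue unless they are outside indexed locations or flagged as not content indexed, can be sketched as follows. This Python sketch is a simplified model and not the Notifications API itself: the GatherQueue class, its priority scheme and the wants_indexing helper are hypothetical, while 0x2000 is the value of the Windows FILE_ATTRIBUTE_NOT_CONTENT_INDEXED file attribute.

```python
import heapq
import itertools
from pathlib import PureWindowsPath

FILE_ATTRIBUTE_NOT_CONTENT_INDEXED = 0x2000  # the "FANCI" attribute bit


class GatherQueue:
    """Toy prioritized queue of URIs awaiting indexing (lower number = sooner)."""

    def __init__(self):
        self._heap = []
        self._pending = {}                 # uri -> best priority currently queued
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, uri, priority=10):
        """Queue a URI; return False if it is already queued at least as urgently."""
        best = self._pending.get(uri)
        if best is not None and best <= priority:
            return False
        self._pending[uri] = priority
        heapq.heappush(self._heap, (priority, next(self._counter), uri))
        return True

    def pop(self):
        """Return the most urgent URI, or None when the queue is empty."""
        while self._heap:
            priority, _, uri = heapq.heappop(self._heap)
            if self._pending.get(uri) == priority:  # skip superseded entries
                del self._pending[uri]
                return uri
        return None

    def __len__(self):
        return len(self._pending)


def wants_indexing(path, attributes, indexed_roots):
    """True if the file lies under an indexed root and is not flagged FANCI."""
    if attributes & FILE_ATTRIBUTE_NOT_CONTENT_INDEXED:
        return False
    parents = PureWindowsPath(path).parents
    return any(PureWindowsPath(root) in parents for root in indexed_roots)


queue = GatherQueue()
for path, attrs in [("C:/Users/dave/a.txt", 0), ("C:/Users/dave/b.txt", 0x2000)]:
    if wants_indexing(path, attrs, ["C:/Users"]):
        queue.push(path)
print(len(queue), queue.pop())
```

Deduplicating on push keeps a burst of change notifications for the same file from producing repeated indexing work.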
Windows Search Configuration APIs are used to specify configuration settings, such as the roots of the URIs that need to be monitored and the frequency of crawling, and to view status information such as the number of items indexed, the length of the gather queue or the reason for throttling the indexer. It also exposes APIs to register protocol handlers (via the ISearchProtocol interface), property handlers (via the IPropertyStore interface) or IFilter implementations (via the IFilter interface). IFilter implementations allow only read-only extraction of text and properties, whereas IPropertyStore allows writing properties as well.
Windows Desktop Search
Windows Desktop Search is the implementation of Windows Search for Windows XP and Windows Server 2003. Searches are specified using the Advanced Query Syntax and are executed while the user types (incremental find). By default, it comes with a number of IFilters for the most common file types (documents, audio and video) as well as protocol handlers for Microsoft Outlook e-mails. Other protocol handlers and IFilters can be installed as needed.
User interface
The Windows Desktop Search functionality is exposed via a Windows Taskbar mounted deskbar. It provides a text field to type the query and the results are presented in a flyout pane. It also integrates as a Windows Explorer window. On selecting a file in the Explorer window, a preview of the file is shown in the right hand side of the window, without opening the application which created the file. Web searches can be initiated from both interfaces, but that will open the browser to search the terms using the default search engine.
The deskbar can also create application aliases, which are short strings that can be set to open different applications. This functionality is accessed by prefixing the ! character to the predefined string. For example, "!calc" opens the Windows Calculator. The help documentation includes syntax for creating application aliases out of any text string, regardless of prefix. This feature can also be used to create shortcuts for URLs which, when entered, open the specified URL in the browser. It can also be used to pass parameters in the URL, which makes it possible to create search aliases. For example, "w text" can be configured to search "text" in Wikipedia.
Releases
Windows Desktop Search was initially released as MSN Desktop Search, as a part of the MSN Toolbar suite. It was re-introduced as Windows Desktop Search with version 2, while still being distributed with MSN Toolbar Suite.
For Windows 2000, Windows XP and Windows Server 2003, it came in two flavors, one for home users and the other for enterprise use. The only difference between the two was that the latter could be configured via group policy. The home edition was bundled with MSN Toolbar, while the other was available as a standalone application. Later, when MSN Toolbar was discontinued in favor of Windows Live Toolbar, the home edition of Windows Desktop Search was discontinued as well. The last version available for Windows 2000 is Windows Desktop Search 2.66.
For Windows XP and Windows Server 2003, version 3.0 of Windows Desktop Search was provided as a standalone release, separate from Windows Live Toolbar. One significant change is that Windows Desktop Search 3.0 also installs the Property System, introduced in Windows Vista, on Windows XP. Windows Desktop Search 3.0 is geared for pre-Windows Vista users, hence the indexer was implemented as a Windows Service, rather than as a per-user application, so that the same index as well as a single instance of the service can be shared across all users, thereby improving performance. Windows Desktop Search found itself in the midst of a controversy on October 25, 2007, when Windows Desktop Search 3.01 was automatically pushed out and installed on Windows systems updated via Windows Server Update Services (WSUS). Microsoft responded with two posts on the WSUS Product Team Blog.
Windows Search is the indexed search platform in Windows Vista, Windows 7 and Windows Server 2008, and offers a superset of the features provided by Windows Desktop Search, while being API compatible with it. Unlike WDS, it can seamlessly search indexed as well as non-indexed locations - for indexed locations the index is used and for non-indexed locations, the property handlers and IFilters are invoked on the fly as the search is being performed. This allows for more consistent results, though at the cost of searching speed over non-indexed locations. Windows Search uses Group Policy for centralized management.
Windows Search indexes offline caches of network shares, in addition to the local file systems, Microsoft Outlook e-mail stores and Microsoft OneNote stores indexed by WDS. Windows Search also supports queries against a remote index. This means that if the file server hosting a network file share is running either Windows Vista or a later version of Windows, or Windows Search 4.0 on Windows XP, any search against the share is run against the server's index and the results are presented to the client system, with files the user does not have access to filtered out. This procedure is transparent to the user.
Unlike Windows Desktop Search on Windows XP, the Windows Search indexer performs its I/O operations with low priority, and the process also runs with low CPU priority. As a result, whenever other processes require I/O bandwidth or processor time, they are able to pre-empt the indexer, significantly reducing the performance hit associated with the indexer running in the background.
Windows Search supports natural language searches; so the user can search for things like "photo taken last week" or "email sent from Dave". However, this is disabled by default. Natural language search expresses the queries in Natural Query Syntax (NQS), which is the natural language equivalent of AQS.
User interface
The search functionality is exposed using the search bars in the Start menu and the upper right hand corner of Windows Explorer windows, as well as Open/Save dialog boxes. When searching from the Start menu, the results are shown in the Start menu itself, overlapping the recently used programs. From the Start menu, it is also possible to launch an application by searching for its executable image name or display name. Searching from the search bars in Explorer windows replaces the content of the current folder with the search results. The Explorer windows can also render thumbnails in the search results if a Thumbnail Handler is registered for a particular file type. It can also render enhanced previews of items in a Preview Pane without launching the default application, if the application has registered a Preview Handler. This can provide functionality such as file type-specific navigation (such as browsing a presentation using next/previous controls, or seeking inside a media file). Preview handlers can also allow certain kinds of selections (such as highlighting a text snippet) to be performed from the preview pane itself. In the Control Panel, the search bar in the window can also search for Control Panel options. However, unlike WDS, Windows Search does not support creating aliases.
There is also a Search Explorer, which is an integrated Windows Explorer window that is used for searches. It presents the user interface to specify the search parameters, including locations and file types that should be searched, and certain operators, without crafting the AQS queries by hand. With Windows Vista SP1, third party applications will be able to override the Search Explorer as the default search interface so that the registered third party application will be launched, instead of bringing up the Search Explorer, when invoked by any means.
In Windows Search, which is part of Windows Vista, it is also possible to save a search query as a Virtual Folder, called a Saved Search or Search Folder which, when accessed, runs the search with the saved query and returns the results as a folder listing. Physically, a search folder is just an XML file (with a .search-ms extension) which stores the search query (in either AQS or NQS), including the search operators as well. Windows Vista also supports query composition, where a saved search (called a scope) can be nested within the query string of another search. Search Folders are also distributable via RSS. By default, Windows references the profile of the user who originally created a Search Folder as part of the query's scope. This design choice does not prevent saved searches from being shared with other users, but it prevents them from operating on different user profiles. While users can manually modify the contents of a saved search so that the scope references the %USERPROFILE% environment variable, which will enable it to operate on other machines or profiles regardless of the original author, Microsoft has released a SearchMelt Creator utility that automates this process for the user.
Windows Search 4.0
Windows Search 4.0 (also previously referred to as Windows Live Search, codenamed Casino or OneView) is the successor to the Windows Search platform for both Windows Desktop Search 3.0 on Windows XP as well as Instant Search on Windows Vista. It is mainly an update to the indexing components, with few changes to the XP user interface and none on Vista. It also enables remote query support on XP and Windows Server 2003 based systems, which used to be a Vista-only feature. This allows a user with a Vista client (or an XP client with Windows Search 4.0) to search the index of networked machines which are also running a supported operating system (Windows 8, 7, Vista, Windows Server 2008, or XP/2003 with Windows Search 4.0).
Windows Search 4.0 was originally proposed by Microsoft's Windows Live division as an application that would unify local and remote indexed search in a new interface. Early screenshots of the program featured the new "flair" interface design seen in other Windows Live client applications of the time such as Windows Live Messenger and Windows Live Mail.
Windows Live Search Center could search web services which used the OpenSearch specification to make search results available as web feeds. It could aggregate searches from various indexes including the Windows Desktop Search index, Windows RSS Platform common feed store, and Microsoft Exchange and Microsoft SharePoint indexes, among others.
The first beta of Windows Search 4.0 was released on March 27, 2008. It included numerous performance improvements to the indexer and brought new features to XP, some of them previously Vista-exclusive: Group Policy integration, federation of searches to remote indexes, support for EFS-encrypted files and Vista-style preview handlers that allow document-type specific browsing of documents in the preview pane.
Windows Search 4.0 was released on June 3, 2008 and is supported on XP, Windows Server 2003, Vista, Windows Server 2008 and Windows Home Server.
See also
- List of desktop search engines
- Microsoft Search Server
- Microsoft enterprise search
- Comparison of enterprise search software
- List of enterprise search vendors
Source of article : Wikipedia