dtSearch document filters support a broad range of
25+ full-text and fielded data search options
- Supports MS Office through current versions (Word,
Excel, PowerPoint, Access), OpenOffice, ZIP, HTML,
XML/XSL, PDF and many other formats
- Supports Exchange, Outlook, Thunderbird and other
popular email types, including nested and ZIP
- Spider supports public and secure, static and
dynamic (ASP.NET, SharePoint, CMS, PHP, etc.) web
- APIs for SQL-type data, including BLOB data
- Highlights hits in all supported data types
APIs for C++, Java and .NET through current versions
- Federated searching
- Special forensics search options
- Advanced data classification objects
- Native 64-bit and 32-bit Win / Linux APIs; .NET
- Document filters also available for separate
Products include: • dtSearch Web with Spider •
dtSearch Engine for Win & .NET • dtSearch Engine for
Linux • dtSearch Publish (for portable media) •
dtSearch Desktop with Spider • dtSearch Network with Spider
dtSearch Desktop with Spider,
dtSearch Network with Spider
dtSearch Desktop with Spider and dtSearch Network with
Spider instantly search popular file types on a PC or a
Both products index and search (with highlighted hits) a
wide variety of local and network content, including PDF,
HTML and XML files; word processor, database, spreadsheet,
presentation and similar "Office" files; ZIP files; and
email messages (Outlook Express, Outlook, Exchange, Eudora,
.MSG and other email formats) along with the full-text of
attachments. Through the dtSearch Spider, both applications
also support adding Web-based content to a local or
dtSearch Publish offers easy publishing of an instantly
searchable document collection to CD, DVD or other portable
media. The product can also mirror an existing Web site on
CD/DVD. The resulting CD/DVD application runs with "zero
footprint" on the end-user’s computer.
dtSearch Web with Spider
dtSearch Web with Spider quickly publishes a large volume
of instantly searchable data to an Internet or Intranet
site. Visitors accessing the site can instantly search with
hit-highlighted display of Web-ready content (HTML, PDF, XML),
including WYSIWYG display of all images, formatting and
links. dtSearch Web automatically converts other file types
("Office", Unicode, ZIP, etc.) to HTML for browser display
with highlighted hits, making non-Web-based files
dtSearch Web installs through an easy wizard-based setup.
The application offers numerous customisation options, so
all aspects of the dtSearch Web display can match the style
of the rest of the site.
Operating through dtSearch Web, the Spider can expand the
scope of the searchable database beyond a site’s own data to
static and dynamic content on third-party sites.
dtSearch Text Retrieval Engine for Win & .NET,
dtSearch Text Retrieval Engine for Linux
The dtSearch Engine lets developers add dtSearch’s
built-in file format support and searching to Web-based and
other applications. The dtSearch Engine for Linux offers
APIs in C++ and Java. The dtSearch Engine for Win & .NET
includes sample code and API support for C++, Delphi, Java,
and .NET—including C#, VB.NET, ASP.NET and ADO.NET. (See
over for special database options for SQL and XML, and other
fielded data information.) The dtSearch Engine also includes
a .NET Spider API.
The dtSearch Spider embedded in multiple dtSearch
products provides integrated searching of remote Web site
content, along with local data. In addition to support for
the file formats above, the dtSearch Spider can also index
and search dynamically generated content, such as
ASP/ASP.NET, MS CMS, SharePoint, etc.
Over two dozen search options
The dtSearch product line generally offers over two dozen
indexed, unindexed, fielded and full-text search options.
dtSearch also offers Unicode support for hundreds of
international languages, special forensics options, and
advanced database search options.
How dtSearch Works
- dtSearch can instantly search terabytes of text
because it builds a search index that stores the
location of words in documents.
- Indexing is easy—simply select folders or entire
drives to index and dtSearch does the rest.
- dtSearch automatically recognises and supports all
popular file formats, and never alters original files.
- A single index can hold over a terabyte of text (up
to billions of documents).
- dtSearch can also create—and search with a single
search request—an unlimited number of indexes.
- Since you may sometimes want to search files that
dtSearch has not indexed, dtSearch also does unindexed
as well as "combination" searching.
- See more details below on file format support and
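As a rough illustration of the indexing idea described above, the following Python sketch builds a small index that stores the position of each word in each document. It is not dtSearch code and does not reflect the dtSearch index format; the names build_index and lookup and the sample file names are assumptions made for this example only.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def lookup(index, word):
    """Return documents and positions for a word without rescanning the text."""
    return index.get(word.lower(), {})

# Hypothetical sample documents; the originals are never modified.
docs = {
    "a.txt": "Due process of law",
    "b.txt": "Equal protection and due process",
}
index = build_index(docs)
print(lookup(index, "process"))
```

Because the word locations are computed once at indexing time, a search only consults the index rather than reading every file again.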
Basic Search Types
- Phrase searching finds phrases, such as
due process of law.
- Boolean operators like and/or/not
can join words and phrases: due process of law and
not (equal protection or civil rights).
- Proximity searching finds a word or
phrase within "n" words of another word or phrase:
apple pie w/38 peach cobbler.
- Directed proximity searching finds
a word or phrase "n" words before another word or
phrase: apple pie pre/38 peach cobbler.
- Phonic searching finds words that
sound alike, like Smythe in a search for
- Stemming finds variations on
endings, like applies, applied,
in a search for apply.
- Numeric range searching finds any
number between two numbers, such as between 6 and 36.
- Macro capabilities make it easy to
include frequently used items in a search request.
- Wildcard support allows ? to hold a
single letter place, and * to hold multiple letter
places: apple* and not appl?sauce.
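The proximity, directed proximity and wildcard options above can be pictured with a short Python sketch. This is a simplified stand-in and not the dtSearch query syntax or engine: it handles single words rather than phrases, and the function names within, before and wildcard are hypothetical.

```python
import re
from fnmatch import fnmatchcase

def tokens(text):
    """Split text into lowercased words."""
    return re.findall(r"\w+", text.lower())

def positions(words, term):
    return [i for i, w in enumerate(words) if w == term]

def within(words, a, b, n):
    """Like w/n: term a occurs within n words of term b, in either order."""
    return any(abs(i - j) <= n
               for i in positions(words, a) for j in positions(words, b))

def before(words, a, b, n):
    """Like pre/n: term a occurs at most n words before term b."""
    return any(0 < j - i <= n
               for i in positions(words, a) for j in positions(words, b))

def wildcard(words, pattern):
    """? matches one character, * matches any run of characters."""
    return [w for w in words if fnmatchcase(w, pattern)]

words = tokens("Peach cobbler is served after the apple pie and applesauce")
print(within(words, "apple", "peach", 38))   # either order
print(before(words, "apple", "peach", 38))   # apple must come first
print(wildcard(words, "apple*"))
```

The difference between the two proximity forms is only the direction test: within compares absolute distance, while before requires the first term to appear earlier.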
- Fuzzy searching uses a proprietary algorithm to find
search terms even if they are misspelled.
- Fuzziness adjusts from 0 to 10 so
you can fine-tune fuzziness to the level of OCR or
typographical errors in your files.
- A search for alphabet with a fuzziness of 1
would find alphaqet; with a fuzziness of 3, it
would find both alphaqet and alpkaqet.
- Fuzziness is not built into the index, so you can
vary fuzziness at the time of each search.
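dtSearch's fuzzy algorithm is proprietary, so the sketch below substitutes a standard edit-distance (Levenshtein) comparison to show the general idea of matching at search time rather than at index time. The max_edits parameter is an assumption for this example and is not equivalent to the dtSearch 0 to 10 fuzziness scale.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete from a
                            curr[j - 1] + 1,     # insert into a
                            prev[j - 1] + cost)) # substitute or match
        prev = curr
    return prev[-1]

def fuzzy_matches(term, vocabulary, max_edits):
    """Return indexed words within max_edits of the search term."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_edits]

# Hypothetical word list, as might be produced by OCR errors.
vocabulary = ["alphabet", "alphaqet", "alpkaqet", "alpine"]
print(fuzzy_matches("alphabet", vocabulary, 1))
print(fuzzy_matches("alphabet", vocabulary, 3))
```

Since the tolerance is applied when the vocabulary is compared against the search term, it can change from one search to the next without rebuilding anything.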
Concept / Synonym / Thesaurus Searching
- Concept searching lets you look for fast
and find quick, speedy, etc.
- dtSearch offers variable levels of automatic
synonym expansion based on a comprehensive semantic
network of the English language.
- You can also add your own thesaurus terms.
Relevancy Ranking and Natural Language
- dtSearch can sort and instantly re-sort searches by
relevancy with respect to number of hits, file name,
file date, etc.
- Natural language algorithms let you
enter a "plain English" or unstructured indexed search
- Relevancy ranking in a natural
language search is based on the frequency and density of
hits in your files.
- For example, in the search request get me Sam's
memo on the 1999 CorpX takeover, if 1999
appeared in 3,000 files, and Sam appeared in
only two files, then Sam would get a much
higher relevancy rating, taking you straight to the most
- A positional scoring option ranks
documents more highly when hits are near the top of a
file, or otherwise clustered in a file.
- Variable term weighting works with
both natural language searching and structured search
requests to provide additional positive or negative
weighting to user-specified terms.
- Variable term weighting can also apply to document
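The Sam and 1999 example above can be illustrated with a common textbook weighting, inverse document frequency, in which rarer terms receive higher weights. This is only a sketch of the principle and not dtSearch's actual ranking formula; the collection size of 10,000 files, the term_weights function and the boosts parameter are assumptions for illustration.

```python
import math

def term_weights(query_terms, doc_counts, total_docs, boosts=None):
    """Weight terms by inverse document frequency, with optional user boosts."""
    boosts = boosts or {}
    weights = {}
    for term in query_terms:
        count = doc_counts.get(term, 0)
        if count:
            weights[term] = math.log(total_docs / count) * boosts.get(term, 1.0)
    return weights

# From the example: 1999 appears in 3,000 files, Sam in only two.
doc_counts = {"1999": 3000, "sam": 2}
weights = term_weights(["sam", "1999"], doc_counts, total_docs=10000)
print(weights)

# A user-specified boost can raise (or, if negative, lower) a term's weight.
boosted = term_weights(["sam", "1999"], doc_counts, 10000, boosts={"1999": 2.0})
print(boosted)
```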
OCR and Imaging
- dtSearch supports the PDF "image with hidden text"
format, and highlights hits
right on the scanned image in this format.
- dtSearch also supports combined text and image
display in HTML and other Web-based formats.
- dtSearch recommends using fuzzy searching for
sifting through possible OCR errors.
Special forensically-oriented features include:
- Automatic parsing of text segments in large data
blocks, such as those recovered through an "undelete"
process, from unallocated computer space, or from
partially recovered file fragments
- Language recognition algorithms for detecting text
in a variety of languages (Western European,
- Automatic recognition of dates, email addresses, and
credit card numbers
- A proprietary tool for converting Outlook and
Exchange message stores, including the full-text of all
attachments, to .MSG, for convenient access without
requiring Outlook and Exchange running.
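As a loose illustration of recognising email addresses and credit card numbers in recovered text, the Python sketch below combines simple regular expressions with the standard Luhn checksum. It is not how dtSearch implements these features; the patterns are deliberately simplified, date recognition is omitted, and all names are hypothetical.

```python
import re

# Simplified patterns, assumed for this example only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_emails(text):
    return EMAIL_RE.findall(text)

def find_card_numbers(text):
    """Return digit strings that look like card numbers and pass Luhn."""
    found = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found

sample = "Contact jo@example.com re card 4111 1111 1111 1111, not 1234 5678 9012 3456."
print(find_emails(sample))
print(find_card_numbers(sample))
```

The checksum step filters out digit runs that merely resemble card numbers, which matters when scanning large blocks of unstructured data.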
dtSearch includes support for hundreds of
- Unicode support built into all dtSearch products
allows for indexing and searching of non-English text,
including every character set, in hundreds of languages
supported by the Unicode standard.
- Search options that automatically work on text in
any language include: fuzzy (adjustable from 0 to 10);
natural language with automatic relevancy-ranking;
variable term weighting; phrase; boolean (and/or/not);
proximity and directed proximity; wildcard; macro;
numeric range; and fielded data (alone or combined with
Language Extension Pack
- The dtSearch product line includes an English noise
word list and stemming rules (to find words such as
learn, learned, learns, learning, etc. that are
- dtSearch's UK distributor offers pre-packaged sets
of noise word lists and stemming rules covering over 25
- For more information on the Language Extension Pack,
please contact dtSearch, or visit www.dtsearch.co.uk.
FindPlus Distributed Searching
Through FindPlus, a single search request can span
multiple local and remote locations.
- For example, a single search request can span
multiple indexes residing on a hard drive, a local area
network, an Intranet server, and even a
publicly-available Web server.
- The same search request can then return
comprehensive search results from all locations, ranking
all retrieved files by relevance, and instantly
re-sorting, for example, by file date.
- FindPlus will display all retrieved files with
highlighted hits (as well as HTML, XML and PDF links,
images and formatting), even if the files reside on
- FindPlus distributed searching uses an XML-based
protocol for streaming search results, making it easy
for developers to integrate this feature into
Using related technology, the dtSearch Spider embedded in
multiple dtSearch products supports indexing with instant
searching of combined remote Web site and locally available
- In addition to PDF, HTML, XML, "Office" documents,
ZIP repositories, and the like, the dtSearch Spider can
also index and search dynamically generated content, such
as ASP/ASP.NET, MS CMS and SharePoint.
- The Spider can follow links vertically within a URL,
or horizontally across URLs, to any specified level of
- The Spider supports public sites, secure content
HTTPS sites, password-accessible sites and forms-based
- After a search, the Spider provides integrated
relevancy ranking and display of local and Spidered
content, including WYSIWYG display with highlighted hits
of Web-ready content.
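Following links to a specified level can be pictured as a depth-limited breadth-first traversal. The sketch below walks a small in-memory link table instead of fetching real pages, so it runs without network access. It is not the dtSearch Spider API; the crawl function, the SITE table and the example.com URLs are assumptions for illustration.

```python
from collections import deque

def crawl(start_url, get_links, max_depth):
    """Breadth-first traversal that stops following links beyond max_depth."""
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # index this page but do not follow its links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

# Hypothetical link structure standing in for a real site.
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/a/deep"],
    "https://example.com/b": ["https://example.com/"],
    "https://example.com/a/deep": [],
}
pages = crawl("https://example.com/", lambda u: SITE.get(u, []), max_depth=1)
print(pages)
```

The seen set prevents pages that link back to each other from being visited twice, and max_depth bounds how far the traversal wanders from the starting URL.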
Databases and Fielded Data
- dtSearch indexes and searches fields in all
supported file types, supporting both full-text
searching encompassing fields and search requests
limited to specific fields.
- After a search, dtSearch can display field values in
- dtSearch can also index and search popular database
formats, including ODBC support and support for BLOB
- For XML, dtSearch supports hierarchical field
structures in XML data, covering both fields and
attributes, enabling highly refined nested field
- For SQL, the dtSearch Engine supports precision
fielded data searching through ADO.NET and COM.
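To make the idea of hierarchical XML fields more concrete, the following sketch flattens an XML record into path-style field names covering both elements and attributes, then limits a search to one field. The path notation, function names and sample record are assumptions for this example and do not represent dtSearch's field syntax.

```python
import xml.etree.ElementTree as ET

def extract_fields(xml_text):
    """Return (field_path, value) pairs for element text and attributes."""
    fields = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}" if path else elem.tag
        for name, value in elem.attrib.items():
            fields.append((f"{here}/@{name}", value))
        text = (elem.text or "").strip()
        if text:
            fields.append((here, text))
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return fields

def search_field(fields, path, term):
    """Match a term only within the named field."""
    return [v for p, v in fields if p == path and term.lower() in v.lower()]

# Hypothetical record for illustration.
record = ('<memo year="1999"><author><name>Sam</name></author>'
          '<body>CorpX takeover</body></memo>')
fields = extract_fields(record)
print(fields)
print(search_field(fields, "memo/body", "takeover"))
```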
Classification and Filters
- dtSearch developer products include extensive
support for adding on-the-fly classification information
and other fields to documents during indexing.
- dtSearch developer products can filter results based
on document full-text contents, document fielded data,
database content, or data attributes attached during
- Search filter objects allow multiple users with
different security classifications to search the same
document collection, without having to maintain separate
indexes corresponding to each classification level.
- A single query can include an "exact phrase"
full-text end-user search request; second-level Boolean
search expressions (such as one or more field or
metadata search criteria); and developer-added filtering
expressions (such as a filtering expression to filter
out documents that do not match a user’s security
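The combination of an end-user phrase search with a developer-added security filter might look roughly like the sketch below. It assumes a hypothetical numeric clearance field attached to each document at indexing time; the Doc class and search function are illustrative only and are not the dtSearch search filter objects.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    name: str
    text: str
    clearance: int  # hypothetical classification field added during indexing

def search(docs, phrase, user_clearance):
    """Apply a phrase match, then drop documents above the user's clearance."""
    phrase = phrase.lower()
    return [
        d.name
        for d in docs
        if phrase in d.text.lower() and d.clearance <= user_clearance
    ]

# One shared collection serves users at different classification levels.
docs = [
    Doc("public.txt", "Notes on the CorpX takeover", clearance=0),
    Doc("secret.txt", "CorpX takeover pricing terms", clearance=2),
]
print(search(docs, "corpx takeover", user_clearance=0))
print(search(docs, "corpx takeover", user_clearance=2))
```

Because the filter is applied per query, the same collection can be searched by every user without maintaining a separate index for each classification level.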
For more information please contact
the MicroWay sales team:
MicroWay Pty Ltd
PO Box 84,
Braeside, Victoria, 3195, Australia
Ph: 1300 553 313
Fax: 1300 132 709
ABN: 56 129 024 825
Sydney Sales Office
MicroWay Pty Ltd
PO Box 1733,
Crows Nest, NSW 1585, Australia
Tel: 1300 553 313
Fax: 1300 132 709
ABN: 56 129 024 825
New Zealand Sales Office
MicroWay Pty Ltd (NZ)
PO Box 912026
Victoria Street West
Auckland 1142, New Zealand
Tel: 0800 450 168
+61 3 9580 1333, fax +61 3 9580 8995