mnoGoSearch for Windows version history can be found here.
24 Oct 2001: 3.2.2
Added meta "Content-Language" processing, added "lang" attribute
processing for <html> and <body> tags.
Added IBM DB2 support. Tested with DB2 EE V7.1.
Stored and storedoc.cgi added. It is now possible to store and
display a compressed copy of indexed documents with
search word highlighting.
Tag values are now passed using the "tag" form variable so that
the variable's meaning is clearer. The old "g" form variable no
longer works.
Major documentation improvements and reorganization.
Fixed that category and language limits were not working.
Fixed that StopwordFile command didn't work in search.htm
Fixed that full/substring/beginning/ending word match didn't
work.
Fixed crash in ServerTable code.
Fixed crash in synonyms code on some platforms.
qtrack table field types changed.
Fixed bug in MySQL single mode code. It could kill the mysqld
server when a document is big enough.
Fixed that iso-8859-1 entities like &eacute; were
not properly converted to Unicode.
Fixed that the HTML parser treated script bodies as
text in some cases.
install.pl installation script has been added.
Some minor configure script and code clean-ups.
27 Sep 2001: 3.2.1
New "Listen" searchd.conf command. It allows binding
searchd to a specified host and/or port.
searchd can now reload searchd.conf when
it receives a HUP signal.
Added some signal safety to searchd.
Fixed that searchd.conf-dist was not included in the
distribution.
Fixed that national letters in the code range
128-255 were treated as word separators when
searchd is used. They also were not displayed in
search results (body, title, etc. fields).
Fixed some bugs in HTML tag parsers that caused
indexer to stall or crash in some cases.
Fixed that "Proxy" command was ignored.
Fixed that robots.txt related code could
stall or crash in threaded version.
Fixed a compilation problem with Oracle.
Fixed compilation problem with errno.h on Solaris.
24 Sep 2001: 3.2.0
Now one can compile with support for several SQL databases at
the same time.
Now one can make a binary distribution using "make bin-dist".
Added new program searchd. Among other features, it allows
building a search cluster distributed across several machines.
Support for fuzzy search using synonyms has been added.
Fuzzy search on common word endings using ispell now works in
the 3.2 branch.
New "ReverseAlias" indexer.conf command. This command has
the same format as the "Alias" command. However, the URL mapping
is applied as soon as a new link is found, and the URL is
stored in the database after ReverseAlias has been applied.
Among other things, this makes it possible to index PHP-driven
sites which add a unique session ID in the form
"PHPSESSION=344646342345df". ReverseAlias is able to remove such
substrings from URLs.
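As a rough illustration of the kind of rewrite the ReverseAlias entry describes, here is a minimal Python sketch that strips a session ID from a URL before it would be stored. It is not mnoGoSearch code or indexer.conf syntax; the function name strip_session_id and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern: a "PHPSESSION=<id>" query parameter, together with
# the separator before it and either a following "&" or the end of the URL.
SESSION_RE = re.compile(r"([?&])PHPSESSION=[0-9A-Za-z]+(&|$)")


def strip_session_id(url: str) -> str:
    """Return url with any PHPSESSION query parameter removed."""
    def repl(match: re.Match) -> str:
        # Keep the leading separator only when more parameters follow.
        return match.group(1) if match.group(2) == "&" else ""
    return SESSION_RE.sub(repl, url)


print(strip_session_id("http://example.com/a.php?PHPSESSION=344646342345df&id=5"))
```

With a mapping like this applied when a link is found, two links that differ only in their session ID end up as the same stored URL.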
New "Subnet xxx.xxx.xxx.xxx" indexer.conf command. It works
like Realm but matches against the IP address instead of the URL.
For example, "Subnet 195.239.38.*" or "Subnet NoMatch 192.*.*.*".
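The wildcard patterns in the Subnet examples can be read as shell-style masks over the dotted IP address. The Python sketch below shows that reading using the standard fnmatch module. The function subnet_applies and its nomatch flag are assumptions for illustration; how indexer actually evaluates "NoMatch" is not spelled out in this entry.

```python
from fnmatch import fnmatchcase


def subnet_applies(ip: str, pattern: str, nomatch: bool = False) -> bool:
    """Return True if a Subnet-style rule applies to the given IP address.

    pattern is a mask such as "195.239.38.*". With nomatch=True the rule
    applies to addresses that do NOT match the mask (assumed semantics).
    """
    matched = fnmatchcase(ip, pattern)
    return matched != nomatch


print(subnet_applies("195.239.38.77", "195.239.38.*"))
print(subnet_applies("10.0.0.1", "192.*.*.*", nomatch=True))
```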
Search results highlighting (HlBeg and HlEnd search.htm commands)
now works in the 3.2 branch.
CT-Lib support has been added. Now one can
use mnoGoSearch together with Sybase and MS SQL
natively, without ODBC drivers. Both the original Sybase
CT-Lib and FreeTDS CT-Lib are supported. However, the CT-Lib
driver is still in beta.
indexer now works approximately twice as fast with InterBase.
Added support for deflate and compress Content-Encodings.
New VarDir command in search.htm. It works like the
indexer.conf command of the same name, but at search time.
New "Section" indexer.conf command. It is to be used instead of
old ***Weight commands, which have been removed. Take a look into
indexer.conf-dist and search.txt for an explanation.
Now it is possible to index user-defined META tags as well as
HTTP response headers.
New "Alias" command in search.htm. It works like "Alias"
in indexer.conf but at search time.
Added support for external includes in search template.
Format differs from 3.1.x version. Take a look into
"templates.txt" for usage information.
"Alias" command has been extended. Now it can optionally use
powerful URL mapping with regular expressions, as in the "Realm"
command.
POSIX threads should now work on more than just Linux and FreeBSD.
Thread detection for a number of platforms has been added.
Fixed libudmsearch compilation with pthreads. This fixes
crashes of Apache with the PHP mnoGoSearch extension module
when mnoGoSearch was compiled with pthreads support.
Tag parser has been rewritten. It now properly processes
tag attributes with '>' signs, for example <META NAME=email
Contents="<general@mnogosearch.org>">.
Earlier, '>' signs inside quotes were treated as tag
endings.
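The quoting problem described in the tag parser entry can be shown with a small Python sketch. It scans for the closing '>' of a tag while skipping any '>' that appears inside a quoted attribute value. This is only an illustration of the idea, not the actual mnoGoSearch parser; find_tag_end is a hypothetical helper.

```python
def find_tag_end(html: str, start: int) -> int:
    """Return the index of the '>' closing the tag that opens at html[start].

    A '>' inside single or double quotes is ignored. Returns -1 if the tag
    is never closed.
    """
    quote = None  # the quote character we are currently inside, if any
    for i in range(start + 1, len(html)):
        ch = html[i]
        if quote is not None:
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == ">":
            return i
    return -1


tag = '<META NAME=email Contents="<general@mnogosearch.org>">'
page = tag + "text after the tag"
naive_end = page.find(">")          # stops at the '>' inside the quotes
quoted_end = find_tag_end(page, 0)  # stops at the real end of the tag
print(naive_end < quoted_end, page[: quoted_end + 1] == tag)
```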
Apple Darwin fixes for configure scripts
Extended number of query parameters stored in qtrack table
Added url.charset field. Charset is now stored separately from
content_type field. Please recreate or ALTER "url" table structure.
"Clones yes/no" has been renamed to "DetectClones yes/no"
to avoid confusion.
8 Aug 2001: 3.2.0.b1
Content encoding support added (currently gzip only).
Requires libz to compile. Use --with-libz to activate.
Fixed that $(DE) template variable was not working
Fixed that the correct charset was forgotten after
robots.txt processing.
3 Jul 2001: 3.2.0.b0
Charset processing has been rewritten. Now mnoGoSearch supports
almost all widely used charsets: various single-byte charsets as
well as multi-byte charsets including UTF-8, Chinese (BIG5, GB2312),
Korean (EUC-KR), Japanese (S-JIS). All internal processing works
using Unicode representation. Using UTF8 as a LocalCharset one can
build a multi-lingual search engine with languages which could not
be indexed at the same time in 3.1.x branch, for example
German+Greek+Russian+Chinese.
The character sets module has new automatic language and charset
detection. Currently more than 70 various charsets and languages
can be detected automatically when they are not specified in
"Content-type" and "Content-Language" server's response headers or
html META tags.
News extensions are now compiled without --enable-news-extensions.
Use "NewsExtensions yes" indexer.conf command to activate them.
Case sensitive Allow/Disallow/CheckOnly/HrefOnly commands have
been added.
New "Realm <regex>" indexer.conf command has been added.
It works like "Server" but takes a regular expression as its
parameter. Servers are no longer sorted by URL length after
loading. They are matched in the order of appearance in indexer.conf.
This means that if you want different parameters for server
subsections, use the "Server" command for the subsection first, then
the "Server" command for the whole server.
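The ordering rule in the Realm entry (first matching entry wins, so the more specific subsection has to come first) can be sketched in Python as follows. The list-of-pairs structure, the find_server function and the example settings are assumptions for illustration and do not reflect how indexer stores its configuration.

```python
from typing import Dict, List, Optional, Tuple

ServerList = List[Tuple[str, Dict[str, str]]]


def find_server(url: str, servers: ServerList) -> Optional[Dict[str, str]]:
    """Return the parameters of the first entry whose URL prefix matches."""
    for prefix, params in servers:
        if url.startswith(prefix):
            return params
    return None


# The subsection is listed before the whole server, as the entry recommends.
servers: ServerList = [
    ("http://example.com/docs/", {"label": "subsection settings"}),
    ("http://example.com/", {"label": "whole-server settings"}),
]

print(find_server("http://example.com/docs/a.html", servers))
print(find_server("http://example.com/index.html", servers))
```

If the two entries were listed in the opposite order, the whole-server entry would match every URL first and the subsection settings would never be used.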
Default indexer.conf has been reorganized. All commands are now
divided into five logical sections.
Crash in UdmFreeParsers() has been fixed.
A bug in page navigator has been fixed. Bug since 3.1.7.
A bug in new Period format has been fixed.
A minor bug which caused slightly non-standard "Accept-Charset"
line in request header has been fixed.
Subtle bug in indexer has been fixed. Indexer followed a redirect
link given in "Location" HTTP header even if "Follow page" or
"Follow no" is specified for current "Server".
16 October 2000: 3.1.7
indexer.conf parameters Period, MirrorPeriod, NetErrorDelayTime
and ReadTimeout can now be specified in a more convenient way.
New values for the Follow indexer.conf command
have been added for more flexible spidering configuration.
New optional follow_type argument of the Server indexer.conf
command has been added to specify site realm.
FollowOutside indexer.conf command has been removed.
Use Follow world instead.
New URL indexer.conf command has been added.
It allows specifying alternative entry points for a server.
New ServerTable table_name indexer.conf command
has been added. It loads server entries with all their parameters from
the database, which makes it possible to configure servers remotely
through a web application.
New CREATE TABLE script for the servers table has been added.
splitter -h now displays short help page.
<NOINDEX> and </NOINDEX> have been added
as synonyms for <!--UdmComment--> and <!--/UdmComment-->.
Subtle bug which produced core dump in cache mode search code has
been fixed.
A bug in search.c which caused threaded version
compilation problems has been fixed.
Thread stack size has been increased to avoid threaded
indexer crashes.
11 October 2000: 3.1.6
search.cgi template name detection has been fixed to
support content negotiation. This makes it possible to install
multi-language search pages.
DefaultLang indexer.conf parameter has been
added to set the default language for a server. Useful if you
restrict searches by language.
InterBase related configure stuff has been fixed.
Some memory leaks in InterBase driver have been fixed.
A bug which was a possible cause of crashes in document
CRC32 calculation has been fixed.
Some ftp code improvements in symlinks processing.
A bug in charset handling which affected the Hebrew and Baltic
charsets introduced in 3.1.4 has been fixed.
A bug in reindexing code for "multi" mode has been fixed.
This fixes "table was not locked" error message in MySQL.
A bug which caused $if() not to work has been fixed in search.cgi.
A bug in splitter for cache mode has been fixed.
27 September 2000: 3.1.5
New "cache" storage mode which is able to index
and quickly search through millions of documents.
New $SearchTime template variable.
New $DE template variable. It displays the description
when it is not empty and the text otherwise.
Boolean search has been implemented in search.cgi.
It uses the same syntax as the PHP front-end.
Greek iso-8859-7 and cp-1253 character sets support
has been added. Thanks to Dimitri Bougoulias.
Hebrew iso-8859-8 and cp-1255 charsets support has been added.
Baltic iso-8859-4, iso-8859-13 and cp-1257 support has been added.
New $DY template variable to display document category.
Memory leak in PostgreSQL driver has been fixed.
"indexer -i -f urllist.txt" core dump bug since 3.1.4 has been fixed.
MP3 code cleanups.
Binary search in host names cache has been added.
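For readers unfamiliar with the technique named in the host names cache entry, the Python sketch below keeps host names in sorted order and looks them up by binary search using the standard bisect module. The HostCache class and its methods are hypothetical and only illustrate the idea; the real cache is part of the C indexer code.

```python
import bisect
from typing import List, Optional


class HostCache:
    """Sorted host name cache with O(log n) lookup (illustrative only)."""

    def __init__(self) -> None:
        self._names: List[str] = []  # kept sorted
        self._addrs: List[str] = []  # parallel to _names

    def add(self, name: str, addr: str) -> None:
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            self._addrs[i] = addr  # refresh an existing entry
        else:
            self._names.insert(i, name)
            self._addrs.insert(i, addr)

    def lookup(self, name: str) -> Optional[str]:
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            return self._addrs[i]
        return None


cache = HostCache()
cache.add("www.example.org", "192.0.2.10")
cache.add("ftp.example.org", "192.0.2.20")
print(cache.lookup("ftp.example.org"), cache.lookup("missing.example.org"))
```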
18 September 2000: 3.1.4
Search results cache support has been added into search.cgi.
This allows very fast output when a query is repeated, for
example while navigating through search result pages.
New command Cache yes/no has been added into search template.
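The benefit described in the two cache entries above can be pictured with a short Python sketch: the full result list for a query is computed once and later pages are served from memory. The ResultCache class, its enabled flag (standing in for "Cache yes/no") and the page size are assumptions for illustration, not the search.cgi implementation.

```python
from typing import Callable, Dict, List


class ResultCache:
    """Caches full result lists per query and serves pages from them."""

    def __init__(self, search_fn: Callable[[str], List[str]],
                 enabled: bool = True, page_size: int = 10) -> None:
        self._search_fn = search_fn
        self._enabled = enabled
        self._page_size = page_size
        self._store: Dict[str, List[str]] = {}

    def page(self, query: str, page_no: int) -> List[str]:
        if self._enabled and query in self._store:
            results = self._store[query]  # repeated query: no new search
        else:
            results = self._search_fn(query)
            if self._enabled:
                self._store[query] = results
        start = page_no * self._page_size
        return results[start:start + self._page_size]


calls = []

def slow_search(query: str) -> List[str]:
    calls.append(query)  # record how often the backend is really hit
    return [f"{query}-doc{i}" for i in range(25)]

cache = ResultCache(slow_search)
cache.page("mnogo", 0)
cache.page("mnogo", 1)
print(len(calls))
```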
Support for Last-Modified and If-Modified-Since has been
implemented for file: URL scheme. This makes reindexing
of unmodified local files significantly faster.
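For local files, the If-Modified-Since idea from the entry above comes down to comparing the file's modification time with the time of the last indexing run. A minimal Python sketch, assuming the last indexing time is kept as a Unix timestamp (the function name needs_reindex is hypothetical):

```python
import os
import tempfile


def needs_reindex(path: str, last_indexed: float) -> bool:
    """Return True if the file was modified after last_indexed (Unix time)."""
    return os.path.getmtime(path) > last_indexed


# Small demonstration with a temporary file whose mtime is set explicitly.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
os.utime(tmp.name, (1000.0, 1000.0))   # pretend it was modified at t=1000
print(needs_reindex(tmp.name, 2000.0))  # indexed later than the change
print(needs_reindex(tmp.name, 500.0))   # indexed before the change
os.remove(tmp.name)
```

Unmodified files can then be skipped without being read or parsed again.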
Server news://servername/groupname syntax support for
NEWS groups has been added into indexer.conf
Support for nntp:// URL scheme has been added.
"Subject:" and "From:" headers are currently decoded
according to RFC 1522.
MP3 headers processing has been added.
HTTP Proxy Basic Authorization support has been added.
<META NAME="Language" Content="xx"> processing has been added.
Host names cache has been added into indexer.
Some configure.in fixes and improvements have been done.
libudmsearch interface functions have been slightly changed.
Shared library creation has been implemented.
A bug which sometimes caused external parsers to hang has been fixed.
A bug which caused heavy CPU load in the threaded indexer version
has been fixed. Thanks to Peter Hanecak <hanecak@megaloman.com> for this.
A bug in robots.txt and gethostbyname() mutexes locking has been fixed.
A minor bug in URL parser has been fixed.
New indexer command line argument to limit maximum indexing time.
A bug which caused wrong document charset detection in some cases
has been fixed.
A bug that indexer did not understand spaces and
special characters in file: URL scheme has been fixed.
A bug that clones were not displayed in search.cgi has been fixed.
New huge English stop-list has been added.
Danish stop-list has been added.
Slovak stop-list has been added.
3 Aug 2000: 3.1.3
$g template variable has been improved to support
parameter $g(X). This displays a tag without X trailing
characters.
Polish stop-list has been added. Thanks Maciek Uhlig
<muhlig@us.edu.pl> for contribution.
Some Oracle improvements have been done.
Some crc-multi mode improvements have been done.
A bug in udm_strnlen which could cause a buffer overflow in
"URL too long" messages has been fixed.
A bug which caused wrong keywords fetching from sql backend
has been fixed.
New categories and tags related documentation.
A bug in URL cache which caused links to be lost
has been fixed. Thanks to Willem Brown <willem@brwn.org>
for discovering this problem.
crc32() has been renamed to UdmCRC32() to avoid compilation
problems on some platforms.
--disable-file configure argument did not work.
Oracle table structure has been fixed.
Minor search.cgi speed improvements.
Compilation problem on Alpha has been fixed.
A bug that content type was modified after
external parser call has been fixed.
Some InterBase related bugs have been fixed.
13 Jul 2000: 3.1.2
HTTP and FTP date parsing functions replaced with ones taken from
Apache to avoid using strptime (which is broken on
Solaris 2.6).
Code added to search.cgi to 'remember' the state of HTML
checkboxes and radiobuttons between pages.
Query tracking facility has been implemented.
Use "TrackQuery yes" template command to enable it.
New template sections navleft_nop, navright_nop,
noquery have been added.
New "udm_recursion" search.cgi parameter to skip
$iurl() directives has been added. This allows to avoid recursion
when search.cgi includes itself.
A patch by The Hermit Hacker to use WHERE rec_id IN(...)
when displaying search results has been applied. This should
make search a bit faster.
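The idea behind the WHERE rec_id IN(...) patch is to fetch all documents for a result page in one query instead of one query per document. The Python sketch below shows this with the standard sqlite3 module and an in-memory table. The column set of the "url" table used here is an assumption for the example, and the real code is C working against the configured SQL backend.

```python
import sqlite3
from typing import List, Sequence, Tuple


def fetch_documents(conn: sqlite3.Connection,
                    rec_ids: Sequence[int]) -> List[Tuple[int, str]]:
    """Fetch all requested rows with a single IN (...) query."""
    if not rec_ids:
        return []
    placeholders = ",".join("?" for _ in rec_ids)
    sql = f"SELECT rec_id, url FROM url WHERE rec_id IN ({placeholders})"
    return conn.execute(sql, list(rec_ids)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany("INSERT INTO url VALUES (?, ?)",
                 [(1, "http://example.com/a"),
                  (2, "http://example.com/b"),
                  (3, "http://example.com/c")])
print(sorted(fetch_documents(conn, [1, 3])))
```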
Text between <STYLE> and </STYLE> is not indexed anymore.
The Perl and PHP front-ends have been removed from the package and
are now distributed as separate packages.
A minor Oracle-related issue in configure.in has been fixed.
Some Oracle improvements have been made.
A minor bug in ispell support has been fixed.
A bug in external parsers code has been fixed.
A bug which appeared when HREF has leading or trailing spaces
has been fixed.
Content-type was not escaped. This caused SQL failure in
some cases.
30 June 2000: 3.1.1
Nested categories support has been added.
Tag type has been changed to CHAR instead of INT.
The above changes require updating database structure.
"Category" indexer.conf command has been added.
"$CP" template variable to display category path has been added.
"$CS" template variable to display current category subtree has
been added.
"cat" search.cgi parameter has been added to pass the category to
search through.
"$cat" template variable to display current category ID.
A bug in cp1250 and iso-8859-2 support has been fixed
Some functions moved from log.c to udmutils.c
udm_snprintf for Solaris has been added
Some bugs have been fixed
15 June 2000: 3.1.0
Native FTP support added so you can index ftp sites without using proxy.
Low-level network functions layer added (work in progress).
New advanced search options (time limits). You can search for recent pages and so on.
md5 hash replaced with crc32 in database.
Some (incompatible) database structure changes needed for the above.