Angolar: powered by CQPweb
Latest news


  • Version 3.0.13, 2013-11-04
    Implemented context-width restrictions for limited-license corpora.
  • Version 3.0.12, 2013-11-02
    Updated database template for newer MySQL servers.
  • Version 3.0.11, 2013-08-30
    New feature: non-classification metadata fields can now be included in a concordance-download.
  • Version 3.0.10, 2013-04-22
    Added some extra protection against possible XSS (cross-site-scripting) attacks.
  • Version 3.0.9, 2013-04-06
    Added a new feature: queries can now be downloaded as "tabulations".
  • Version 3.0.8, 2013-03-22
    Added a debugging backtrace to the error messages seen by superusers.
    Added Yates' continuity correction to the calculation of Z-score in the Collocation function.
    The usual miscellaneous bug fixes, including one affecting character encoding.
  • Version 3.0.7, 2013-03-19
    Fixed a bug affecting creation of batches of user accounts.
    Fixed a bug causing the number of hits in a categorised query to be displayed incorrectly.
    Fixed a bug causing insertion of line-breaks into queries with long lines.
    Fixed an inconsistency in how batches of usernames are created.
    Fixed a bug in the management of user groups, plus a bug affecting the installation of corpora that are not in UTF-8.
    Fixed a bug in the install/delete corpus procedures which made deletion of a corpus difficult if its installation had previously failed halfway through.
  • Version 3.0.6, 2012-05-15
    More bug fixes.
    Added a new feature: a full file-by-file distribution table can now be downloaded.
    Adjusted the Distribution interface to make it more like the Collocations interface.
  • Version 3.0.5, 2012-02-19
    Just bug fixes, but major ones!
  • Version 3.0.4, 2012-02-10
    New feature: optional position labels in the concordance, just like "sentence numbers" in BNCweb (this feature was originally planned for 3.0.3 but was not complete in that release).
    Extended the XML visualisation system to allow conditional visualisations (ditto).
    XML visualisations now actually appear in the concordance (but only partially rendered: they look like raw XML).
  • Version 3.0.3, 2012-02-05
    Mostly a boring bug-fix release, with only one new feature: users can now customise their default thin-mode setting.
    Fixed a bug in concordance download function that was scrambling links to context.
    Fixed a bug in categorisation system that allowed invalid category names to be created.
    Fixed a bug in frequency list creation that introduced forms in the wrong character set into the database.
    Fixed a bug in the keyword function's frequency table lookup process.
  • Version 3.0.2, 2011-08-28
    Added the long-awaited "upload user's own query" function.
    Finished the administrator's management of XML visualisations. Coming next, implementation in concordance view.
    Made it possible for a user to have the same saved-query name in two different corpora.
    Fixed a bug that made non-reproducible random thinning actually always reproducible!
  • Version 3.0.1, 2011-08-20
    Implemented a better system for sorting corpora into categories on the homepage.
    Fixed a fairly nasty bug that was blocking corpus indexing.
    Fixed an uninformative error message when textual restrictions are selected that no texts actually match (zero-sized section of the corpus). The new error message explains the issue more clearly.
  • Version 3.0.0, 2011-07-18
    New feature: custom postprocess plugins!
    Fixed some bugs in unused parts of the CQP interface.
    Added support for all ISO-8859 character sets.
    Version number bumped to 3.0.0 to match new CWB versioning rules, though CQPweb is in fact now compatible with the post-Unicode versions of CWB (3.2.0+).
  • Version 2.17, 2011-05-18
    Fixed a fairly critical (and very silly) bug that was blocking compression of indexed corpora.
    Added extra significance-threshold options for keywords analysis.
  • Version 2.16, 2011-03-08
    Added a workaround for a problem that arises with some MySQL security setups.
    Added an optional RSS feed of system messages, and made links in system messages display correctly both within webpages and in the RSS feed.
    Created a storage location for executable command-line scripts that perform offline administration tasks (in a stroke of unparalleled originality, I call it "bin").
    Added customisable headers and logos to the homepage (a default CWB logo is supplied).
    Fixed a bug in right-to-left corpora (Arabic etc.) where collocations were referred to as being "to the right" or "to the left" of the node even though this was wrong by about 180 degrees.
  • Version 2.15, 2010-12-02
    Licence switched from GPLv3+ to GPLv2+ to match the rest of CWB. Some source files remain to be updated!
    A framework for "plugins" (semi-freestanding programlets) has been added. Three types of plugins are envisaged: transliterator plugins, annotator plugins, and format-checker plugins. Some "default" plugins will be supplied later.
    Some tweaks have been made to the concordance download options, in particular, giving a new default download style (“field-data-style”).
    For the administrator, there is a new group-access-cloning function.
    The required version of CWB has been dropped back down to a late v2, but you still need 3.2.x if you want UTF-8 regular expression matching to work properly in all languages.
    Improvements to query cache management internals.
    Plus the usual bug fixes, including some that deal with security issues, and further work on the R interface.
  • Version 2.14, 2010-08-27
    Quite a few new features this time. First, finer control over concordance display has been added; if you have the data, you can now have concordance results rendered as three-line examples (field data or typology style with interlinear glosses).
    The R interface is ready for use with this version, although it is not actually used anywhere yet, and additional interface methods will be added as the need for them becomes evident. It goes without saying that you need R installed in order to do anything with this.
    The new Web API has been established, and the first two functions "query" and "concordance" created. Documentation for the Web API is still on the to-do list, and it's not quite ready for use...
    Plus, a new function for creating snapshots of the system (useful for making backups); a "diagnostic" interface for checking out common problems in setting up CQP (incomplete as yet); and some improvements to the documentation for system administrators.
    Also added a new subcorpus creation function which makes one subcorpus for every text in the corpus.
  • Version 2.13, 2010-05-31
    Increased required version of CWB to 3.2.0 (which has Unicode regular expression matching). This means that regular expression wildcards will work properly with non-Latin alphabets.
    Also added a function to create an "inverted" subcorpus (one that contains all the texts in the corpus except those in a specified existing subcorpus).
    Plus, as ever, more bug fixes and usability tweaks.
  • Version 2.12, 2010-03-19
    Added first version of XML visualisation.
    Also made distribution tables sortable on frequency or category handle (latter remains the default).
    Also added support for CQP macros and for configurable context width in concordances (including xml-based context width as well as word-based context width).
    Plus many bug fixes and minor tweaks.
  • Version 2.11, 2010-01-20
    First release of 2010! CQPweb is now two years old.
    Added improved group access management, and a setting allowing corpora to be processed in a case-sensitive way throughout (not recommended in general, but potentially useful for some languages e.g. German).
    Also added a big red warning that pops up when a user types an invalid character in a "letters-and-numbers-only" entry on a form.
    Plus lots of bug fixes.
  • Version 2.10, 2009-12-18
    Added customisable mapping tables for use with CEQL tertiary-annotations.
  • Version 2.09, 2009-12-13
    New metadata-importing functions and other improvements to the internals of CQPweb.
  • Version 2.08, 2009-11-27
    Updated internal database-query interaction. As a result, CQPweb requires CWB version 2.2.101 or later.
    Other changes (mostly behind-the-scenes): enabled Latin-1 corpora; accelerated concordance display by caching number of texts in a query in the database; plus assorted bug fixes.
  • Version 2.07, 2009-09-08
    Fixed a bug in context display affecting untagged corpora.
  • Version 2.07, 2009-08-07
    Enabled frequency-list comparison; fixed a bug in the sort function and another in the corpus setup procedure.
  • Version 2.06, 2009-07-27
    Added distribution-thin postprocessing function.
  • Version 2.05, 2009-07-26
    Added frequency-list-thin postprocessing function.
  • Version 2.04, 2009-07-05
    Bug fixes (thanks to Rob Malouf for spotting the bugs in question!) plus improvements to CQP interface object model.
  • Version 2.03, 2009-06-18
    Added interface to install pre-indexed CWB corpus and made further tweaks to admin functions.
  • Version 2.02, 2009-06-06
    Fixed some minor bugs, added categorised corpus display to main page, added option to sort frequency lists alphabetically.
  • Version 2.01, 2009-05-27
    Added advanced subcorpus editing tools. All the most frequently-used BNCweb functionality is now replicated.
  • Version 1.26, 2009-05-25
    Added Categorise Query function.
  • Version 1.25, 2009-04-05
    Added Word lookup function.
  • Version 1.24, 2009-03-18
    Added concordance sorting.
  • Version 1.23, 2009-03-01
    Minor updates to admin functions.
  • Version 1.22, 2009-01-20
    Added support for right-to-left scripts (e.g. Arabic).
  • Version 1.21, 2009-01-06
    Added (a) concordance downloads and (b) concordance thinning function.
  • Version 1.20, 2008-12-19
    Added (a) improved concordance Frequency Breakdown function and (b) downloadable concordance tables.
  • Version 1.19, 2008-11-24
    New-style simple queries are now in place! This means that "lemma-tags" will now work for most corpora.
  • Version 1.18, 2008-11-20
    The last bits of the Collocation function have been added in. Full BNCweb-style functionality is now available. The next upgrade will be to the new version of CEQL.
  • Version 1.17, 2008-11-12
    Links have been added to collocates in collocation display, leading to full statistics for each collocate (plus position breakdown).
  • Version 1.16, 2008-10-23
    Concordance random-order button has now been activated.
  • Version 1.15, 2008-10-11
    A range of bugs have been fixed.
    New feature: a link to “corpus and tagset help” now appears in the middle of the footer on every page.
  • Version 1.14, 2008-09-16
    Not much change that the user would notice, but the admin functions have been completely overhauled.
    The main user-noticeable change is that UTF-8 simple queries are now possible.
  • Version 1.13, 2008-08-04
    Added collocation concordances (i.e. concordances of X collocating with Y).
    Also added system-messages function.
  • Version 1.12, 2008-07-27
    Upgrades made to database structure to speed up collocations and keywords.
  • Version 1.11, 2008-07-25
    Added improved user options database.
  • Version 1.10, 2008-07-13
    Added frequency list view function, plus download capability for keywords and frequency lists.
  • Version 1.09, 2008-07-03
    Added keywords, made fixes to frequency lists.
  • Version 1.08, 2008-06-27
    Added collocations (now with full functionality). Added frequency list support for subcorpora.
  • Version 1.07, 2008-06-10
    Added collocations function (beta version only).
  • Version 1.06, 2008-06-07
    Minor (but urgent) fixes to the system as a result of changes to MySQL database structure.
  • Version 1.05, 2008-05-23
    Added subcorpus functionality (not yet as extensive as BNCweb's).
  • Version 1.04, 2008-02-04
    Added restricted queries, and successfully trialled the system on a 4M word corpus.
  • Version 1.03, 2008-01-23
    Added distribution function.
  • Version 1.02, 2008-01-08
    Added save-query function and assorted cache management features for sysadmin.
  • Version 1.01, 2008-01-06
    First version of CQPweb with fully working concordance function, cache management, CSS layouts, metadata view capability and basic admin functions (including username control) -- trial release with small test corpus only.
  • Autumn 2007.
    Development of core PHP scripts, the CQP interface object model and the MySQL database architecture.
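
The 3.0.8 entry above mentions adding Yates' continuity correction to the Z-score used by the Collocation function. As a rough illustration only, the sketch below uses one common formulation: z = (O - E) / sqrt(E), with the correction shrinking |O - E| by 0.5 (but never past zero). The function name and signature are hypothetical, and this is not taken from CQPweb's source (which is PHP); the exact formula CQPweb uses may differ.

```python
import math


def z_score(observed: float, expected: float, yates: bool = True) -> float:
    """Z-score of an observed vs. expected co-occurrence frequency.

    Assumption: z = (O - E) / sqrt(E). With yates=True, |O - E| is reduced
    by 0.5 before dividing, and clamped so it never crosses zero.
    """
    if expected <= 0:
        raise ValueError("expected frequency must be positive")
    diff = observed - expected
    if yates:
        # Shrink the difference toward zero by 0.5, keeping its sign.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    return diff / math.sqrt(expected)


if __name__ == "__main__":
    print(z_score(10, 4, yates=False))  # uncorrected
    print(z_score(10, 4))               # with continuity correction
```

The correction matters most for low expected frequencies, where the uncorrected score tends to overstate how strongly a pair of words is associated.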
Known bugs as of 2008-12-19


  • Query history
    Items in query history should be auto-deleted after a set time (one week); this doesn't seem to be happening. (added 2008-06-15)
  • Text metadata table
    The text metadata table does not escape >, < and & as HTML entities. (added 2008-06-10)
  • Query history
    "Insert query" links in column 3 of the history display don't work if the restriction is a subcorpus. (added 2008-06-07)
  • Flyby infoboxes
    In Internet Explorer, flyby infoboxes don't appear. (added 2008-06-07)
    This only happens when CQPweb is accessed over the Internet. Over Intranet, the popup boxes appear fine. This seems to be something to do with Windows/IE security settings blocking the JavaScript that creates the infoboxes. IE doesn't block the script over Intranet; nor, apparently, over HTTPS.
    Update: in Google Chrome, the flyby boxes appear intermittently for some corpora (haven't yet checked on other browsers).
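
The text metadata bug listed above concerns the characters >, < and & not being converted to HTML entities before display. A minimal sketch of that kind of escaping, using Python's standard `html.escape`, is given here purely for illustration; CQPweb itself is written in PHP, and the helper name `metadata_cell` is hypothetical.

```python
import html


def metadata_cell(value: str) -> str:
    """Wrap a metadata value in a table cell, escaping &, <, > and quotes."""
    return "<td>" + html.escape(value, quote=True) + "</td>"


if __name__ == "__main__":
    print(metadata_cell("AT&T <draft>"))
```

Escaping at output time, rather than when the metadata is stored, keeps the database values unchanged for downloads and other non-HTML uses.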

03-2011: CRPC v2.0
11-2011: CRPC v2.1 more metadata!
04-2012: CRPC v2.2 sentences and chunked NPs
10-2012: CRPC v2.3 livro and revista divided into técnico/didáctico/literário in restricted query

12-2012: The Child Corpus
08-2013: The Mozambique Corpus
11-2013: Update to CQPweb v3.0.12

CQPweb v3.0.12 © 2008-2013
Corpus and tagset help
This is the unregistered version.
Registered users can use this version.
To ask for registration, write an email to