Simple Search

Greek examples on this page are given in Beta code, to avoid forcing a particular encoding on different browsers.

The Simple Search page can be accessed either at the end of a Canon Search (i.e. once you have selected the authors and/or works to be searched) or directly from the sidebar. If the search is accessed through the Canon interface, it is restricted to the texts selected there. If you select a Full Corpus Search from the sidebar, the search engine by default searches through the Word Index for complete words. For searches involving segments of words or non-text, or otherwise not going through the Word Index (i.e. Textual Search), see Advanced Search.


Search for:

To begin a search, type the string to search for (or use the interactive keyboard to enter the characters). This is normally a word in Beta code or Latin transliteration, although your configuration may make it possible to enter your search in a Greek font. Case is normally ignored in Simple Search (case-sensitive searches are allowed in Advanced Search).

When you enter your word, the search engine compares it to the Greek words it knows about, and reports back the matching words it finds. The search string is treated as a prefix of possible words; this allows users to search over stems rather than just words. For example, if you enter the string ELLEB or elleb, the search engine returns a list of 7 words starting with ELLEB (with or without diacritics):

Search for ELLEB.*:
7 matching words found in index (totalling 16 instances)
  • E)LLEBORI/ZEIS (1)
  • E)LLEBORI/SAI (1)
  • E)LLE/BORON (10)
  • E(LLE/BORON (1)
  • E(LLE/BOROS (1)
  • E(LLEBO/ROU (1)
  • E(LLEBO/RW| (1)

Note that in wildcard searches through the word index, the search string is treated as a substring rather than a prefix, and the signs ^ and $ (or space) are needed to anchor the beginning and end of a word. This makes it possible for you to search for infixes and suffixes of words.

Each word is followed by the number of times it occurs in the full corpus. For example, the form E)LLE/BORON occurs 10 times in the corpus, but the form E(LLE/BORON occurs only once. At the head of the list, the number of distinct words and the total number of instances are given; in this case, there are seven distinct words, and 16 total instances in the corpus.
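The lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the TLG's actual implementation: the dictionary WORD_INDEX, the helper names strip_diacritics and prefix_search, and the set of characters treated as diacritics are hypothetical. The word forms and counts are those of the ELLEB list above.

```python
# Hypothetical in-memory word index: Beta code form -> instances in the full corpus.
# Forms and counts are taken from the ELLEB example list.
WORD_INDEX = {
    "E)LLEBORI/ZEIS": 1,
    "E)LLEBORI/SAI": 1,
    "E)LLE/BORON": 10,
    "E(LLE/BORON": 1,
    "E(LLE/BOROS": 1,
    "E(LLEBO/ROU": 1,
    "E(LLEBO/RW|": 1,
}

# Assumption: breathings ) (, accents / \ =, diaeresis +, iota subscript |.
DIACRITICS = set(")(/\\=+|")


def strip_diacritics(form):
    """Upper-case the form and drop all diacritic characters."""
    return "".join(ch for ch in form.upper() if ch not in DIACRITICS)


def prefix_search(query, index):
    """Return the forms whose stripped spelling starts with the stripped query,
    together with the total number of instances of those forms."""
    key = strip_diacritics(query)
    matches = {w: n for w, n in index.items()
               if strip_diacritics(w).startswith(key)}
    return matches, sum(matches.values())


matches, total = prefix_search("elleb", WORD_INDEX)
print(f"{len(matches)} matching words found in index (totalling {total} instances)")
```

With the seven forms listed above, the sketch reports 7 matching words totalling 16 instances, whether the query is typed as ELLEB or elleb.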

If you are performing a search not on the entire corpus, but on specific authors or works, then the search engine displays only the words attested in your subset of the corpus. However, the word counts given will still be those for the full corpus.

Given this list, you may either select the words on which you wish to conduct your search, and press Selected Words, or press All Words to conduct a search for all the word forms listed.

If you wish to specify a precise word form, rather than a prefix, you should append a space at the end of the word. For example, a search for ANDROS returns the forms A)/NDROS, A)NDRO/S, A)NDROSQE/NHN, A)NDROSQE/NHS, and A)NDRO/SFIGGAS. A search for ANDROS followed by a space, on the other hand, will return only A)/NDROS and A)NDRO/S. (To search for one rather than the other, you will need a diacritics-sensitive search.)
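The effect of the trailing space can be illustrated with a short sketch. The function name match_forms and the diacritic set are assumptions made for the example; the word forms are the ANDROS forms quoted above, and the real search engine may work quite differently internally.

```python
# Assumption: breathings ) (, accents / \ =, diaeresis +, iota subscript |.
DIACRITICS = set(")(/\\=+|")


def strip_diacritics(form):
    """Upper-case the form and drop all diacritic characters."""
    return "".join(ch for ch in form.upper() if ch not in DIACRITICS)


def match_forms(query, forms):
    """Prefix match by default; whole-word match if the query ends in a space."""
    exact = query.endswith(" ")
    key = strip_diacritics(query.strip())
    if exact:
        return [f for f in forms if strip_diacritics(f) == key]
    return [f for f in forms if strip_diacritics(f).startswith(key)]


FORMS = ["A)/NDROS", "A)NDRO/S", "A)NDROSQE/NHN", "A)NDROSQE/NHS", "A)NDRO/SFIGGAS"]

print(match_forms("ANDROS", FORMS))   # all five forms
print(match_forms("ANDROS ", FORMS))  # only A)/NDROS and A)NDRO/S
```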

Note that the search engine will not allow retrieval of more than 500 distinct word forms from the word index. The retrieval stops at the first 500 entries, out of which the words found in your subcorpus are selected. Thus, even if you actually see only a couple of words returned, you are still warned by the search engine that the search is incomplete. If you wish to search for different word forms with a mechanism more fine-tuned than prefix search, you may consider a wildcard search.
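The interaction between the 500-form limit and a subcorpus can be sketched as below. The function name capped_lookup, the synthetic index entries, and the decision to ignore diacritics handling are all assumptions made to keep the example short; only the limit of 500 and the order of operations (cut off first, then filter by subcorpus) come from the description above.

```python
MAX_FORMS = 500  # retrieval limit stated in the text


def capped_lookup(index_forms, prefix, subcorpus_forms):
    """index_forms: word forms of the full index, in index order.
    subcorpus_forms: set of forms attested in the selected authors or works.
    Returns the forms shown to the user and whether the search is incomplete."""
    hits = [f for f in index_forms if f.startswith(prefix)]
    incomplete = len(hits) > MAX_FORMS
    first_hits = hits[:MAX_FORMS]  # retrieval stops at the first 500 entries
    shown = [f for f in first_hits if f in subcorpus_forms]
    return shown, incomplete


# Synthetic placeholder index of 600 forms (not real corpus data).
index = [f"A{i:04d}" for i in range(600)]
shown, incomplete = capped_lookup(index, "A", {"A0003", "A0550"})
print(shown, incomplete)
```

In this synthetic case only one form is displayed, yet the search is flagged as incomplete, and the subcorpus form that lies beyond the first 500 index entries is never reached. This is the situation in which a more selective search string is worth trying.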

Once you have specified the word or words you are searching for, and pressed Selected Words or All Words, the search engine retrieves the instances of the words from the corpus. Each matched word is highlighted in red. At the top of each results page, the number of instances searched so far is given, as well as the total number of instances that will be considered; this gives you some idea of how far along in the search results you are. For example, the search for ELLEB above yields the following result page:

Search for ELLEB. (searched 3 out of 16 instances):

1. Demosthenes Orat., De corona. {0014.018}. Section 121 line 4.

GOREUE/TW.' TI/ OU)=N, W)= TALAI/PWRE, SUKOFANTEI=S; TI/ LO/GOUS
PLA/TTEIS; TI/ SAUTO\N OU)K
E)LLEBORI/ZEIS E)PI\ TOU/TOIS; A)LL'
OU)D' AI)SXU/NEI FQO/NOU DI/KHN EI)SA/GWN, OU)K A)DIKH/MATOS OU)DE-
 (5)

2. Plutarchus Biogr. et Phil., Alexander. {0007.047}. Chapter 41 section 7 line 2.

41.
(7.) KA)KEI=NON [QU=SAI] E)KE/LEUSEN. E)/GRAYE DE\ KAI\ *PAUSANI/A| TW=|
I)ATRW=| BOULOME/NW| TO\N *KRATERO\N
E)LLEBORI/SAI, TA\ ME\N
A)GWNIW=N, TA\ DE\ PARAINW=N O(/PWS XRH/SHTAI TH=| FARMAKEI/A|.

3. Plutarchus Biogr. et Phil., Demetrius. {0007.057}. Chapter 20 section 3 line 4.

DIH=GEN. *)/ATTALOS D' O( *FILOMH/TWR E)KH/PEUE TA\S FARMAKW/-
DEIS BOTA/NAS, OU) MO/NON U(OSKU/AMON KAI\
E)LLE/BORON, A)LLA\ KAI\
KW/NEION KAI\ A)KO/NITON KAI\ DORU/KNION, AU)TO\S E)N TOI=S BASI-
 (5)

The criteria for what constitutes a word according to the TLG word index are discussed at some length in Technical Note on Greek Word Definition. In brief, hyphenated words are joined together, intervening beta escapes are ignored, and word accentuation is normalized (graves are conflated with acutes, second accents are removed).
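Two of these normalization steps can be sketched in a few lines. This is an assumption-laden illustration rather than the Word Indexer's real code: the function names are hypothetical, the circumflex (=) is assumed to count as an accent, and the handling of intervening beta escapes is left out entirely.

```python
ACCENTS = "/\\="  # acute, grave, circumflex in Beta code


def normalize_accents(form):
    """Conflate graves with acutes and keep only the first accent of a word."""
    form = form.replace("\\", "/")
    out = []
    seen_accent = False
    for ch in form:
        if ch in ACCENTS:
            if seen_accent:
                continue  # drop a second (or later) accent
            seen_accent = True
        out.append(ch)
    return "".join(out)


def join_hyphenated(first_part, second_part):
    """Join a word broken across two lines with a line-final hyphen."""
    return first_part.rstrip("-") + second_part


print(normalize_accents("TO\\N"))        # grave conflated with acute
print(normalize_accents("A)/NQRWPO/S"))  # illustrative form with a second accent
print(join_hyphenated("OI)/-", "KWN"))   # OI)/- and KWN, as in the context examples on this page
```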

In addition, word fragments are listed separately from words; a fragmentary start of a word is denoted by an initial !, and a fragmentary end of word by final !. For example, a search for !ANA yields six word forms (!ANA, !ANA!, !ANA/GA!, !A)NAGNO/NTES, !ANAL!, !ANAP!); the first results page looks as follows:

Search for !ANA. (3 out of 8 instances):

1. Sophocles Trag., Fragmenta (Radt). {0011.008}. Fragment 213**,col 1 line 8.

[] @1
[ ]
[]
!N
[]ANA?:
[]!TOS[!]S?

2. Sophocles Trag., Fragmenta (Radt). {0011.008}. Fragment 133** line 2.

(133**.) [!!!!]EITOU![
[
!!!!]ANA![
[!!!]E![!]RAT' E[

3. Sophocles Trag., Fragmenta (Radt). {0011.008}. Fragment 1132**,12 line 1.

(1132**,12.) ...
[      ]ANA?[  (1)
[]BOLH KLU?[

The program generating the Word Index (Word Indexer) has worked out that all these instances of ANA have a fragmentary beginning (the preceding right bracket marks a lacuna); but the first instance does not have a fragmentary ending, as it is not followed by a bracket indicating a further lacuna (unlike the next two instances).

The instance of !A)NAGNO/NTES points to a more general problem with the word index:

6. Aristophanes Comic., Fragmenta (Austin). {0019.016}. Fragment 66 line 5.

]           :M?[
]A)NAG?N?O/?N?T?E?s3?[ @1 (5)

While a human being would recognize that A)NAGNO/NTES is meant to be a complete word, the program generating the word index treats the preceding ] as denoting a lacuna. In general, the contents of the word index are only as good as the markup of the underlying text; where errors have crept into the text, or where the markup relies on human knowledge (as in this instance, where the bracket should have been separated from the ensuing word by a space, to ensure it is not considered fragmentary), those errors will be reflected in the word index. The same holds for misspellings, orthographical peculiarities, and so on.

Note finally that words with distinct accentuation are listed separately; for example, E(LLE/BORON is considered a distinct word from E)LLE/BORON. This also extends to words in all-capitals, which normally have no accentuation: ILIADOS (all-capitals in the titles of the books of the Iliad) is listed as a distinct word form from I)LIA/DOS (in discussions of the Iliad).


Results per page:

This option determines how many search results may be displayed at a time on a web page. If the count is exceeded in a search, the website appends a button (Next Results) prompting the user to continue the search (see Display of Results).


Lines of context per result:

This option determines how many lines of text should be displayed for each search result. When 0 is selected, the system will display citations (with links to the Canon) only. If 1 line is selected, the results are shown in abbreviated form; if the word sought is broken across more than one line (e.g. hyphenated), the website attempts to display two lines of context rather than one.

The following illustrates a search with 1, 3, and 5 lines of context:

1.

1. 0006.020: 327.3. LEITW=N A)P' OI)/KWN EU)= LE/GH| PE/NHS A)NH/R,

2. 0006.020: 378.1. NU=N D' H)/N TIS OI)/KWN PLOUSI/AN E)/XH| FA/TNHN, (3)

3. 0006.020: 453.11. TA\N D' E)XQRA\N STA/SIN EI)=RG' A)P' OI)/- {2A)NT.}2
KWN TA\N MAINOME/NAN T' E)/RIN (10)

3.

1. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 327 line 3.

SOFOU\S TI/QESQAI TOU\S LO/GOUS, O(/TAN DE/ TIS
LEITW=N A)P'
OI)/KWN EU)= LE/GH| PE/NHS A)NH/R,
GELA=N: E)GW\ DE\ POLLA/KIS SOFWTE/ROUS

2. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 378 line 1.

OU) TOU)/NOM' AU)TOU= TH\N FU/SIN DIAFQEREI=.
(378.) NU=N D' H)/N TIS OI)/KWN PLOUSI/AN E)/XH| FA/TNHN,
PRW=TOS GE/GRAPTAI TW=N T' A)MEINO/NWN KRATEI=:

3. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 453 line 11.

TA\N D' E)XQRA\N STA/SIN EI)=RG' A)P' OI)/- {2A)NT.}2
KWN TA\N MAINOME/NAN T' E)/RIN (10)
QHKTW=| TERPOME/NAN SIDA/RW|.

5.

1. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 327 line 3.

(327.) FILOU=SI GA/R TOI TW=N ME\N O)LBI/WN BROTOI\
SOFOU\S TI/QESQAI TOU\S LO/GOUS, O(/TAN DE/ TIS
LEITW=N A)P'
OI)/KWN EU)= LE/GH| PE/NHS A)NH/R,
GELA=N: E)GW\ DE\ POLLA/KIS SOFWTE/ROUS
PE/NHTAS A)/NDRAS EI)SORW= TW=N PLOUSI/WN
 (5)

2. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 378 line 1.

PAI=DAS FUTEU/EIN: O(\S GA\R A)\N XRHSTO\S FU/H|,
OU) TOU)/NOM' AU)TOU= TH\N FU/SIN DIAFQEREI=.
(378.) NU=N D' H)/N TIS OI)/KWN PLOUSI/AN E)/XH| FA/TNHN,
PRW=TOS GE/GRAPTAI TW=N T' A)MEINO/NWN KRATEI=:
TA\ D' E)/RG' E)LA/SSW XRHMA/TWN NOMI/ZOMEN.

3. Euripides Trag., Fragmenta (Nauck). {0006.020}. Fragment 453 line 11.

I)/QI MOI, PO/TNA, PO/LIN. @1
TA\N D' E)XQRA\N STA/SIN EI)=RG' A)P' OI)/- {2A)NT.}2
KWN TA\N MAINOME/NAN T' E)/RIN (10)
QHKTW=| TERPOME/NAN SIDA/RW|.
(454.) TEQNA=SI PAI=DES OU)K E)MOI\ MO/NH| BROTW=N


Greek display:

The Greek display option determines which font or scheme the Greek text of the results of any search will be displayed in. The options given include the more widely used polytonic Greek fonts, as well as Roman transliteration schemes. For further details on configuring your browser to use those fonts, see Font Configuration. The fonts and transliterations used are:


Display of Results:

In any given session, the specified number of search results is displayed, in the Greek font specified by the user, and with the number of lines of context per instance specified. At the bottom of the page, the user may click Next Results to see the next page of results, if further results are expected. The user may also click Printable form to display the results in a form ready for printing (without hyperlinks, colors, or headers).

If 10 seconds have elapsed and the requisite number of results for the page has not yet been found, the website displays a form asking the user whether to continue the search (until the requisite number of results is found), or to display the results gathered so far --- which may be none.


Display of Beta Escapes:

The texts in the TLG corpus do not contain only Greek text. They contain milestones indicating the current citation in the text. They also contain codes indicating formatting, sigla, the semantics of the text, and so forth. These codes are known as Beta Escapes, and are documented in the TLG Beta Code manual.

Any display system will be able to render on the screen only a subset of the information specified in Beta code. This may be because the code exceeds the current capabilities of the system (e.g. character stacking, <10), because no character is available to display it (e.g. the diple periestigmene, #14), or because the code encodes content rather than formatting per se (e.g. stage directions, {). Any Beta escapes that cannot be resolved with the current display options (that is, rendered as they look on the printed page) are displayed as raw beta codes in green.

Where Beta codes are displayed on the screen, whether resolved or not, they are always hyperlinked to definitions incorporating a picture of the symbol and comments on instances of deviant usage. These definitions are drawn from the TLG Beta Code manual. For instance, in the following example (from the Hippiatrica Berolinensia), the Beta escapes are underlined; they include not only the unresolved symbol for ounce, #106, and the numeral signifier (keraia) #, but also the initial indentation of the paragraph and the angle brackets around KO/STOU:

1. Hippiatrica, Hippiatrica Berolinensia. {0738.001}. Chapter 22 section 18 line 1.

MA/TIZE. @1
(18.) 

A)/LLO PRO\S BH=XA KAI\ BOULSOU/S. (t)
  *KINNAMW/MOU LI/TRAN MI/AN #106 D#, KASTORI/OU #106 D#, STU/RA- (1)
KOS #106 D#, O)PI/OU #106 B#, SMU/RNHS #106 D#, <KO/STOU #106 D#>, KASI/AS
#106 D#, A)KO/ROU #106 D#, BALSA/MOU SPE/RMATOS #106 B#, A(LUSA/QROU #106

Clicking on any instance of # will yield the following definition:
# ´

Numeric signifier or abbreviation marker

Keraia (Prime); The numeric vs. abbreviation denotation of the sign is ambiguous in the corpus (e.g. PLA/G# = PLA/GIOS in 3023). When numbers are indicated with overbar (<) in the text or no explicit typographical indication, the prime indicates fraction instead (Gardthausen 1913:II 373). E.g. 0363.014: G# is 1/3, rather than 3. Double # can mean fraction, as in 2021.038.

To identify Beta escapes for the user, all instances of Beta escapes are displayed in green. However, where the same sign can stand for either a resolved or an unresolved Beta escape, only the unresolved escape is highlighted, so that the two can be told apart. Thus, a highlighted ? is an (unresolved) subscript dot, while an unhighlighted ? is a Roman question mark. The pairs are:

Sign | Resolved | Unresolved
?    | Roman Question mark (%1) | Subscript dot
!    | Exclamation mark (%4) | Missing letter
%    | Percentage sign (%8) | Crux
#    | Hash sign (%101) | Numeric signifier
^    | Caret (%104) | Tab space
'    | Apostrophe | Single Quotation mark ("3)*
&    | Ampersand (%9) | Roman font
[[   | Opening Double Bracket ([4) | Two opening square brackets
]]   | Closing Double Bracket ([4) | Two closing square brackets
<    | Open Single Guillemet, Open Angle Bracket, Diple, Line filler, Drachma ("7, [2, #15, #101, #323) | Open Overbar
>    | Close Single Guillemet, Close Angle Bracket, Diple, Line filler, Half Drachma ("7, ]2, #18, #1512, #1337) | Close Overbar
<<   | Opening Double Angle Bracket ([18) | (Two Open Overbars)
>>   | Closing Double Angle Bracket ([18) | (Two Close Overbars)

* Highlighted as an escape code as distinct from the textual apostrophe, not because it is unresolved.


Greek Input:

This option specifies how the user will type their search — whether in a transliteration scheme, in Beta Code, Unicode, or Monotonic (Modern Greek). See Input In Greek for information on how to set your browser for Greek text input, and what its limitations may be.

Note: stress accents cannot be entered in (unaccented) Latin transliteration, so searches sensitive to stress are disallowed in that encoding. In addition, the coronis cannot be specified in transliteration, as the default value of the character used is the apostrophe.


Suppress raw beta escapes:

TLG texts contain codes indicating formatting, sigla, the semantics of the text, and so forth. These codes are known as Beta Escapes, and are documented in the TLG Beta Code manual.

By specifying Suppress raw beta escapes, the display of Beta escapes unable to be resolved under the current display options is suppressed. Compare:

&nehmer an den von Sisyphos aus Anlab der Anschwemmung des toten
&Melikertes zu Ehren des Poseidon gestifteten Isthmischen Spielen). @1
(2.)   &Pap. Ox. 2250  (1n)
(16a.) 
&(oberer Rand)$
{4%43?}4 A)/]GE DH/, BASILEU=, #74 [%40 %40 %41 %40 %40 %41  (1)
KAI\ CU/MPASAN #74 M[%40 %40 %41 %40 %40 %41
TOU= BAQUPLOU/TO[U #74 %40 %40 %41 E)/CW
P
?ENI/AS NAI/WN #74 K[AI\ %41 %40 %40 %41
PAL?]I?KH\N SKH/PTR?[WI #74 %40 %40 %41 %40 %40 %41,  (5)
DE/C]A?[I?] ME?[!!!] #74 F?I?[LI/AS XW/RAS

nehmer an den von Sisyphos aus Anlab der Anschwemmung des toten
Melikertes zu Ehren des Poseidon gestifteten Isthmischen Spielen).
(2.)   Pap. Ox. 2250  (1n)
(16a.) 
(oberer Rand)
A)/]GE DH/, BASILEU=, [  (1)
KAI\ CU/MPASAN M[
TOU= BAQUPLOU/TO[U E)/CW
PENI/AS NAI/WN K
[AI\
PAL
]IKH\N SKH/PTR[WI ,  (5)
DE/C]A[I] ME[] FI[LI/AS XW/RAS

In the second example, the square brackets remain in place, though highlighted, because they have been rendered into their proper typographic form. All other escapes were unresolved, and thus have been eliminated.

This option is incompatible with Including Beta escapes.


Diacritics sensitive:

If this option is specified, the search is sensitive to Greek diacritics, which are otherwise ignored in the search string. In the simple search form, the search is either sensitive to all diacritics (breathings, stress, iota subscript) or none. In the advanced search page, you can select any combination of these diacritics. For example, a diacritics-sensitive search for E)LLEB will return only the following results:

The instances of the Hellebore stem with rough breathing are ignored. Since diacritics-sensitivity involves both breathings and accents, the instance E)LLE/BORON, which does not match E)LLEB, is also ignored. However, if you specified Breathings Only in the Advanced search page, the acute would be ignored, and E)LLE/BORON would also be retrieved.

Whatever the diacritics mode specified, diaereses are always ignored.
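The difference between the diacritics modes can be sketched as follows. The names fold and search, the keep parameter, and the grouping of Beta code characters into classes are assumptions made for this illustration; the word forms are those of the ELLEB example.

```python
BREATHINGS = set(")(")
ACCENTS = set("/\\=")      # stress accents
IOTA_SUBSCRIPT = set("|")
DIAERESIS = set("+")       # always ignored, whatever the mode

CLASSES = (("breathings", BREATHINGS),
           ("accents", ACCENTS),
           ("iota", IOTA_SUBSCRIPT))


def fold(form, keep):
    """Remove every diacritic class not named in `keep`."""
    drop = set(DIAERESIS)
    for name, chars in CLASSES:
        if name not in keep:
            drop |= chars
    return "".join(ch for ch in form.upper() if ch not in drop)


def search(query, forms, keep=()):
    """Prefix search that is sensitive only to the diacritic classes in `keep`."""
    key = fold(query, keep)
    return [f for f in forms if fold(f, keep).startswith(key)]


FORMS = ["E)LLEBORI/ZEIS", "E)LLEBORI/SAI", "E)LLE/BORON", "E(LLE/BORON",
         "E(LLE/BOROS", "E(LLEBO/ROU", "E(LLEBO/RW|"]
ALL = ("breathings", "accents", "iota")

print(search("E)LLEB", FORMS, keep=ALL))             # fully diacritics-sensitive
print(search("E)LLEB", FORMS, keep=("breathings",))) # Breathings Only
print(len(search("ELLEB", FORMS)))                   # diacritics ignored
```

In the fully sensitive mode the sketch keeps only E)LLEBORI/ZEIS and E)LLEBORI/SAI; with breathings only, E)LLE/BORON is matched as well, and the forms with rough breathing are excluded in both cases.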


Link to Perseus:

If this option is specified, each word in the results page is hyperlinked to the corresponding morphological analysis entry on the Perseus website (or one of its mirrors), with the Greek display option corresponding to the display specified for the search (or Latin transliteration, if this is unavailable at Perseus). Only Greek words are analysed in this way; words in Roman script and beta escapes are excluded.

Users may choose to link to Perseus home (Tufts U.) or one of its mirrors by using the pull-down menu.

The Perseus morphological analysis specifies each word form and gives dictionary entries corresponding to the form in the online Liddell-Scott-Jones dictionary. Note that the TLG text holdings are more extensive than the coverage of Liddell-Scott-Jones, ranging over papyrological and mediaeval material (including some vernacular) as well as epic and classical Greek. Furthermore, words in the TLG texts are frequently fragmentary, abbreviated, non-lexical (e.g. numerical strings, magical incantations), non-Greek (e.g. transliterated Latin, Biblical names), and so on. So there will frequently arise word forms for which Perseus can supply no morphological analysis and no dictionary entry.

A few more details about links to Perseus:

Case is preserved in the hyperlink, and resolved at Perseus. For example, an instance of *TO\N links to the analysis of TO/N. Perseus differentiates ambiguous cases between a proper name and a common noun: E(STI/A links to the analysis of 'hearth', but an instance of *(ESTI/A links to the analysis of 'Vesta'. (The dictionary entry typically includes both alternatives.) Strictly speaking, *(ESTI/A can also mean 'hearth' at the start of a sentence with modern capitalization; most TLG texts, however, do not capitalize in this fashion.

If the entry is all in capitals, and thus has no diacritics, Perseus may judge the form to be ambiguous. If the ambiguity only involves accent, Perseus will generate all the possible forms. For example, *K*A*L*W will generate a morphological query for KALW; Perseus will then generate analyses for all of:

However, if the ambiguity involves breathing, Perseus queries which breathing is required. For example, *O*R*O*S will generate a morphological query for OROS; Perseus will then query whether O)ROS 'mountain' or O(ROS 'boundary' is intended. Similarly, *E*S*T*I*A will generate a morphological query for ESTIA; Perseus will then query the user whether E)STIA or E(STIA is desired. Since the form E)STI/A does not exist in the dictionary, Perseus will then generate the possible analyses of E(STIA:

--- though not *(ESTI/A 'Vesta'.


Ignore Incomplete words:

This option (which applies only to word index searches) eliminates from the search results all incomplete words — namely, all words beginning or ending with a word fragment sign (!).
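As a rough sketch of what this filter does, assuming word forms are held as plain Beta code strings (the function names are hypothetical, and the sample forms are taken from the !ANA and ELLEB examples earlier on this page):

```python
def is_incomplete(form):
    """A form is incomplete if it begins or ends with the fragment sign !."""
    return form.startswith("!") or form.endswith("!")


def ignore_incomplete(forms):
    """Drop every incomplete form from a list of word index entries."""
    return [f for f in forms if not is_incomplete(f)]


FORMS = ["!ANA", "!ANA!", "!ANA/GA!", "!A)NAGNO/NTES", "!ANAL!", "!ANAP!",
         "E)LLE/BORON"]

print(ignore_incomplete(FORMS))  # only the complete form E)LLE/BORON remains
```

Note that a form such as !A)NAGNO/NTES is dropped as well, since the filter relies on the fragment signs recorded in the word index rather than on whether a reader would consider the word complete.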


Save My Searches

The TLG allows you to save up to 10 completed searches (Canon or textual searches) to review at a later time.

Created: Feb. 14, 2000
Last Modified: March 12, 2009
Maintained by tlg-support@uci.edu
TLG® is a registered trademark of The Regents of the University of California.