"Since the mere possession of writings does not give knowledge, how are we to extract from this almost incomprehensibly large collection of written records the knowledge that we need?"
Enormous strides have been taken to revolutionize the speed and manner in which information on international sales law is disseminated. The University of Freiburg's CISG online website,
Uniquely, a large percentage of the resources that have been established - UNCITRAL, Unilex, Pace, the Members of the Autonomous Network of CISG Websites - have offered their information on the Internet free of charge. This has enabled persons from all geographical backgrounds to utilize the sources, thus taking great strides toward the development of the global jurisconsultorium that is necessary to foster the uniform application of the law.
The challenge is that in international sales law, more attention has been paid to the amount of information that is disseminated than to the manner in which it is presented. A predictable retrieval system should be the next step to build solidly upon that which the international community has already created. There is an incorrect presumption that researchers know the legal concepts about which they need information and how to obtain this information. If the West approach had simply been to provide a comprehensive system of reporting without systematically classifying the information obtained, it is possible that [page 236] the ability to properly rely upon precedent would have been strained because of the conflicts arising in jurisdictions based on the inability to obtain necessary legal materials.
"The development of a homogeneous body of law under the [Sales] Convention depends on the channels for the collection and sharing of judicial decisions and bibliographic material so that experience in each country can be evaluated and followed or rejected in other jurisdictions."
There are three systems currently in place for the retrieval of materials on international sales law - the print index, computer-based Boolean searching and a system of organization based on substantive legal content. Each of these systems is analyzed in turn to determine whether any of them can provide the necessary structure for the creation of the framework to conceptualize international sales law.
(1) The traditional print index " . . . allows a researcher to locate relevant information in a collection of documents. The most common type lists subjects alphabetically, followed by a reference allowing the researcher to locate the information in the document collection."
(2) The second system is computer-based and relies on Boolean searching as the major means to find relevant information. Boolean searching, discussed in more detail infra, allows a user to search machine-readable files for keywords that best describe a topic. A unique feature of the Boolean system is that the user can combine keywords or phrases using the operators "and", "or" and "not." The problem, however, is that Boolean searching presupposes that users know exactly what they are looking for because it relies on exact terminology. For example, searching for material on "acceptance of goods" will not retrieve documents that employ the term "taking delivery of goods". Unless the material within the computer is "marked" in a certain way, Boolean searching does not take into account synonyms.
Lawyers are accustomed to terminology derived from their domestic laws. Yet, a lawyer using domestic terminology to obtain information on international concepts may not retrieve the information he or she is seeking (e.g., the phrase "rescission of the contract" may not retrieve CISG cases reflecting "avoidance of the contract"). Moreover, since many persons are not familiar with the range of commands that retrieval systems provide to refine searches (e.g., nested Boolean searching or obtaining relevancy rankings), document retrieval by laypersons can have minimal and inconsistent results.
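The limitation described above can be illustrated with a minimal sketch of exact-phrase Boolean matching. The three document texts, the function names and the parameter names are invented for illustration and are not drawn from any existing retrieval system.

```python
import re

# Hypothetical mini-collection; the texts are invented for illustration only.
DOCUMENTS = {
    "doc1": "The buyer's acceptance of goods was delayed by two weeks.",
    "doc2": "The buyer refused taking delivery of goods at the port.",
    "doc3": "The seller declared avoidance of the contract after the breach.",
}

def contains(text, phrase):
    """Exact, case-insensitive phrase match, as in a plain keyword search."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return re.search(pattern, text.lower()) is not None

def boolean_search(docs, all_of=(), any_of=(), none_of=()):
    """Return ids of documents satisfying AND (all_of), OR (any_of), NOT (none_of)."""
    hits = []
    for doc_id, text in docs.items():
        if not all(contains(text, p) for p in all_of):
            continue  # an AND term is missing
        if any_of and not any(contains(text, p) for p in any_of):
            continue  # none of the OR terms is present
        if any(contains(text, p) for p in none_of):
            continue  # a NOT term is present
        hits.append(doc_id)
    return sorted(hits)

# The exact phrase finds doc1 but silently misses doc2, which uses a synonym.
print(boolean_search(DOCUMENTS, all_of=["acceptance of goods"]))
# Domestic terminology retrieves nothing, although doc3 is on point.
print(boolean_search(DOCUMENTS, all_of=["rescission of the contract"]))
```

Only a user who already knows both phrasings can recover the second document, by combining them with "or" (here, the any_of argument).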
Web pages that include a Boolean search option usually also provide the contents of the site in a list; sub-categories may be displayed after the top-level heading is "clicked." The lists frequently give only the title of the item of information, however, and not the category or term for the legal concept that the work represents. Therefore, this search method may really be useful only when the user knows the precise title of the document he or she is looking for.
(3) The third retrieval system we have today in international sales law moves away from traditional research methodologies and relies more heavily on the substantive content of the law as a means for the structure of its classification system. It has taken two forms. Both are organized under a provision of a law. Taking the CISG as an example, the first organizes its documents by legal issue; the second also organizes them through an "Annotated Text Page."
(a) UNILEX uses a system that organizes CISG cases under each CISG Article pursuant to a list of legal issues that could arise under that Article.
(b) The Institute of International Commercial Law of the Pace University School of Law
Both forms of the third retrieval system come closer to creating the architecture necessary for information retrieval; however, neither goes far enough to offer a system that will aid sufficiently in the uniform conceptualization of all international sales law - not yet at least.
All of the aforementioned methods assume that the user has a sophisticated level of knowledge in researching international topics. Most lawyers do not have this knowledge. Collectively, the information is too scattered among the different resources; individually, none of the current resources has established a system adequate to ensure quick, thorough retrieval results when the number of CISG cases and commentaries grows into the tens of thousands. Moreover, foreign case law that is provided through these sources is often written in a language unknown to the reader.
A uniform system for information retrieval should be created to provide a framework for how the law itself should be viewed. Additionally, a systematic, comprehensive case translation program must coexist with the information retrieval system to ensure that information that is retrieved can actually be utilized.
This goal raises the next question: What should the blueprints for the creation of a uniform system of information retrieval look like? What are the first realistic steps that should be taken towards establishing a framework for international sales law? As was the situation over a century ago in the US, the answer is found in library science methodologies, specifically, the creation of an information retrieval thesaurus.
A uniform system for information retrieval would help achieve a more consistent application of international sales law. The term "uniform system" suggests that in different media - print or computer-based - legal concepts would be indexed using the same controlled terminology. Ideally, all information sources would be merged to provide a "one-stop shop" for international sales law. Until that goal is realized, consistent, uniform classification of information in the various sources is the next best practical step.
There are two tools that could be used for the creation of a uniform language for the classification of information - classification schema, and information retrieval thesauri.
(1) The first option, a classification schema, assigns numbers to categories of information. Subjects are then classified by number (e.g., the Dewey Decimal System). This system does not provide the structure necessary to create an autonomous international vocabulary. Creating categories for legal topics could be a respectable beginning, but it will ultimately be a flawed route to the control of information because it allows too many domestic law ideas to be pushed into broad categories.
(2) The second option, the creation of an information-retrieval thesaurus for international sales law, is the most effective tool for the organization of materials in this field of law. Unlike either the UNCITRAL Thesaurus on the CISG,
An information retrieval thesaurus can be created using either a deductive method (terms are extracted from documents, but no control over the terms is made until enough terms are gathered, and then relationships are assigned) or through an inductive method (terms are selected as they are encountered in documents; vocabulary control and relationships are applied at the outset).
For the creation of an international-sales-law thesaurus, an inductive method should be applied to immediately delineate domestic terms from international terms and select preferred descriptors. The scope of this thesaurus is international sales law; the range of its domain can therefore vary based on subjective definitions of this field. Generally, we can commence assembling descriptors for this thesaurus by deriving them from the CISG, UNIDROIT Principles, Principles of European Contract Law (PECL), lex mercatoria, case law, and scholarly commentaries on them, arbitration rules (inter alia, institutional rules and the UN Model Law on International Arbitration) and Incoterms. Reference materials that are released by the United [page 241] Nations and other organizations and associations should also be incorporated. For example, the United Nations has published an "International Trade Law Terminology" in three languages, and the International Chamber of Commerce provides a book of "Key Words in International Trade" with terminology represented in five languages.
It is not necessary that every commentary on international sales law (ranging in the thousands) be consulted in the creation of this thesaurus. Rather, "key" books and articles should be referred to initially. Descriptors can be modified later, or new descriptors added based on the terms discovered through further research and indexing - the thesaurus is alive; it can always be modified to reflect new legal thoughts. Moreover, since this is a list reflecting international terminology, it should be annotated so that different jurisdictions can be assured that it reflects a balance of sources from different countries and legal cultures.
Although the thesaurus is premised on the idea of extracting terms from international sales law and then applying its terms to classify this law, the importation of commercial law terms from domestic laws should not be precluded. This feature will only enhance the influence the thesaurus could have on the goal of an autonomous interpretation of international sales law generally. By way of illustration: in the United States a person conducting research on international sales law who is not familiar with its domain would likely use the terminology from Article 2 of the UCC in that person's search. If the thesaurus includes terminology from the UCC, but directs the user to terms which represent parallel legal concepts in international sales law, the researcher is more likely to get all the information needed and is no longer relying on domestic law to find the answer to international legal questions. The incorporation of domestic laws into the structure will impact the substantive development of the law
One of the unique attributes of the information retrieval thesaurus is that it establishes relationships among the terms. The relationships have the ability to control the terms that will denote legal concepts and also place each term within a framework delineating its position in the hierarchy of all of the other descriptors representing legal concepts. [page 242]
a. Equivalence relationship
Entry Term - rescission of contract
Use - avoidance of contract
Descriptor - avoidance of contract
Used for (UF) - termination of contract
                rescission of contract
                renunciation of contract
                repudiation of contract
                cancellation of contract
In its multilingual form, the thesaurus could also direct the user, for example, from the German equivalent of "avoidance of contract" (Rücktritt) to the English term, which would lead to all the information on the subject regardless of the language.
A further example:
Entry Term - PECL
Use - Principles of European Contract Law
Descriptor - Principles of European Contract Law
Used for (UF) - PECL
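The equivalence relationship can be sketched as a simple lookup from entry terms to preferred descriptors. The following is only an illustration, using the terms listed above; the names USE, preferred_term and used_for are hypothetical and do not come from any thesaurus software.

```python
# "Use" references: entry term -> preferred descriptor (terms taken from the
# examples above; the data structure itself is an illustrative assumption).
USE = {
    "termination of contract": "avoidance of contract",
    "rescission of contract": "avoidance of contract",
    "renunciation of contract": "avoidance of contract",
    "repudiation of contract": "avoidance of contract",
    "cancellation of contract": "avoidance of contract",
    "PECL": "Principles of European Contract Law",
}

# Case-insensitive lookup table built once from the "Use" references.
_LOOKUP = {entry.lower(): descriptor for entry, descriptor in USE.items()}
_DESCRIPTORS = set(USE.values())

def preferred_term(term):
    """Return the descriptor for an entry term or descriptor, else None."""
    key = " ".join(term.lower().split())  # normalize case and spacing
    if key in _LOOKUP:
        return _LOOKUP[key]
    for descriptor in _DESCRIPTORS:
        if descriptor.lower() == key:
            return descriptor
    return None

def used_for(descriptor):
    """Return the reciprocal "Used for" list of a descriptor."""
    return sorted(entry for entry, d in USE.items() if d == descriptor)

print(preferred_term("Rescission of Contract"))  # avoidance of contract
print(used_for("Principles of European Contract Law"))  # ['PECL']
```

Because the "Used for" list is derived from the "Use" references rather than stored separately, the two directions of the relationship cannot drift apart.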
b. Hierarchical relationship
Descriptor - damages
Broader Term (BT) - remedies
Narrower Term (NT) - consequential damages
                     incidental damages [page 243]
c. Associative relationship
Descriptor - damages
Related Terms (RT) - calculation of damages
                     mitigation of damages
                     reduction in damages
                     proof of damages
Relationships such as these are explained in the Addendum to this paper and defined and explained further in the ANSI/NISO Guidelines for the Construction, Format and Management of Monolingual Thesauri.
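By way of illustration, the hierarchical and associative relationships shown above might be recorded as follows. This is a sketch only: the Descriptor class and the expand_narrower function are hypothetical names, and the entries are limited to the "damages" example given in the text.

```python
from dataclasses import dataclass, field

@dataclass
class Descriptor:
    """One thesaurus record with its BT, NT and RT references."""
    term: str
    broader: list = field(default_factory=list)   # BT
    narrower: list = field(default_factory=list)  # NT
    related: list = field(default_factory=list)   # RT

# Entries reproduce the "damages" example; "remedies" carries the reciprocal NT.
THESAURUS = {
    "remedies": Descriptor("remedies", narrower=["damages"]),
    "damages": Descriptor(
        "damages",
        broader=["remedies"],
        narrower=["consequential damages", "incidental damages"],
        related=[
            "calculation of damages",
            "mitigation of damages",
            "reduction in damages",
            "proof of damages",
        ],
    ),
    "consequential damages": Descriptor("consequential damages", broader=["damages"]),
    "incidental damages": Descriptor("incidental damages", broader=["damages"]),
}

def expand_narrower(term, thesaurus):
    """Return the term followed by all of its narrower terms, depth-first."""
    result = [term]
    entry = thesaurus.get(term)
    if entry is not None:
        for narrower_term in entry.narrower:
            result.extend(expand_narrower(narrower_term, thesaurus))
    return result

# A search on the broad heading can be widened to everything beneath it.
print(expand_narrower("remedies", THESAURUS))
```

A search interface could use such an expansion to retrieve material on consequential and incidental damages when the user asks only for "remedies", while offering the related terms as suggestions rather than as automatic additions.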
English is today the most popular language for writings on international sales law. An international sales thesaurus may therefore commence with the English language; however, the mechanisms used to organize this field of law should not be reliant solely on the terminology of one language. Applying relevant thesaurus standards, the information retrieval thesaurus should also include various languages to effectively incorporate materials from around the world into the framework. The International Organization for Standardization (ISO) has established a standard for the creation of a multilingual thesaurus.
Software designed for the creation of information retrieval thesauri permits the creator of a thesaurus to generate broad subject categories for its terms. These categories could be most effectively used in this domain by assigning terms to specific international law instruments (or, more specifically, the Article numbers within the law), e.g., the CISG, UNIDROIT Principles or PECL. Similar categories can be created for terms derived from Arbitral Associations or [page 244] Incoterms. By creating relationships between terms and assigning them to subject categories, the thesaurus designer provides a multitude of possibilities for the creation of search mechanisms and for manipulating the presentation of information based on users' needs in the confines of a uniform terminology.
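The assignment of terms to subject categories might be sketched as below. The assignments are illustrative assumptions rather than a proposed classification; article numbers are given only for the CISG (Art. 74 on damages, Art. 77 on mitigation), and the function names are hypothetical.

```python
# (descriptor, instrument, article) assignments. Illustrative only: the list
# is an assumption made for this sketch, not an authoritative classification.
ASSIGNMENTS = [
    ("damages", "CISG", "Art. 74"),
    ("mitigation of damages", "CISG", "Art. 77"),
    ("avoidance of contract", "CISG", None),
    ("damages", "UNIDROIT Principles", None),
    ("damages", "PECL", None),
]

def terms_for(assignments, instrument, article=None):
    """Return descriptors assigned to an instrument, optionally to one article."""
    return sorted({
        term
        for term, inst, art in assignments
        if inst == instrument and (article is None or art == article)
    })

def instruments_for(assignments, term):
    """Return the instruments under which a descriptor has been categorized."""
    return sorted({inst for t, inst, _ in assignments if t == term})

print(terms_for(ASSIGNMENTS, "CISG"))
print(terms_for(ASSIGNMENTS, "CISG", "Art. 74"))
print(instruments_for(ASSIGNMENTS, "damages"))
```

The same table answers both questions a researcher is likely to ask: which concepts arise under a given instrument or Article, and under which instruments a given concept is treated.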
A thesaurus is the first, but essential, step in the creation of a uniform framework for the conceptualization of international sales law. The thesaurus becomes most useful when case law, scholarly commentaries and legislative history materials are indexed together for information retrieval. All these documents can be classified under descriptors from the thesaurus, with different descriptors assigned to different legal instruments as appropriate; those descriptors are then used in the index. For example, because the CISG and the UNIDROIT Principles and PECL assign different meanings to the terms "avoidance" and "termination":
The marvel of a thesaurus is that it can ensure that all information is "tagged" using the same terms, which will have the implicit effect of teaching lawyers to associate and categorize particular terms with either their domestic law or a particular international legal instrument. Consider the comments of Daniel P. Dabney in his article, The Curse of Thamus: An Analysis of Full-Text Legal Document Retrieval:
"Another effect of subject authority control [thesaurus control] in indexing may be an influence on the substantive development of the subject of the collection. For example, some of the terms that might be used as subject headings have connotations that implicitly comment on the subject matter so indexed. Consider, for example, that generations of lawyers and judges have found law relating to employment relations under the heading "Master and Servant." This subject heading no doubt seemed reasonable to the legal community of the turn of the century when the heading was incorporated into the West key number system. A different segment of the society of that period might have found it reasonable to put such material under the heading "Toiler and Leech," and colored future perception of the topic in a different way. "Toiler and Leech" seems outrageous to us; "Master and Servant" seems merely archaic, but this is to a large extent the effect of familiarity. . . . The precoordination [page 245] of subject headings in a thesaurus also may affect the development of the literature by making it appear that certain ideas go together and others do not."
This quote is not only an indication of the influence that a thesaurus can have on the perception, growth and development of concepts in the law, but further serves as a warning to the international sales law community as it works to create an information retrieval system. A methodology that is created to classify information must maintain a high level of flexibility to ensure that new legal thoughts are not recycled into archaic classification schemes. Descriptors should be periodically reviewed by practitioners and academics within this area of law to ensure that the terms are representative of current legal concepts, and are not, in effect, hindering the progression of the law.
It is now time to index all international sales law based on a uniform terminology derived from a suitable information retrieval thesaurus to influence the substantive development of the subject, so that courts and arbitral tribunals will place certain legal ideas together (international) and keep others apart (domestic and international).
"[A] revolution in legal research is taking place right now because of a technological change. . . . With computers, researchers can formulate their own word searches rather than rely entirely on the predetermined indexing of a digest."
Free-text searching and Boolean logic are tools used in the context of computer-based searching. "Full-text searching enables a researcher to search for every occurrence in the database of any word or combination of words without a pre-existing index."
Whether one is a supporter or a critic of Boolean or free-text searching, neither approach should be considered the last and most effective tool for creating a uniform information retrieval methodology for international sales law. Free-text searching assumes a certain level of knowledge with respect to the terminology that must be used in the search. As mentioned supra, in most applications it has been made neither to handle synonyms nor to consider the legal background of the user (who may be using domestic terminology familiar to him or her).
For international sales law, an index (based on the terms in the thesaurus) should be incorporated into search interfaces to allow the user to see and utilize the framework that has been created for the law.
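A minimal sketch of such an interface follows, assuming documents have already been tagged with thesaurus descriptors. The document identifiers, the tags and the function names are hypothetical; the German entry term is the one mentioned earlier in the text.

```python
# Hypothetical documents, each tagged with descriptors from the thesaurus.
TAGGED_DOCUMENTS = {
    "case-001": ["avoidance of contract", "damages"],
    "commentary-017": ["avoidance of contract"],
    "case-002": ["damages"],
}

# Entry terms (domestic or foreign-language) pointing to preferred descriptors.
ENTRY_TERMS = {
    "rescission of contract": "avoidance of contract",
    "rücktritt": "avoidance of contract",
}

def build_index(tagged_documents):
    """Invert the tagging: descriptor -> sorted list of document ids."""
    index = {}
    for doc_id, descriptors in tagged_documents.items():
        for descriptor in descriptors:
            index.setdefault(descriptor, []).append(doc_id)
    return {descriptor: sorted(ids) for descriptor, ids in index.items()}

def search(query, index, entry_terms):
    """Resolve the query to a descriptor, then return the documents indexed under it."""
    key = " ".join(query.lower().split())
    descriptor = entry_terms.get(key, key)
    return index.get(descriptor, [])

INDEX = build_index(TAGGED_DOCUMENTS)

# The headings themselves can be displayed, so the user sees the framework.
print(sorted(INDEX))
# A domestic or German term leads to the same documents as the descriptor.
print(search("Rescission of contract", INDEX, ENTRY_TERMS))
print(search("Rücktritt", INDEX, ENTRY_TERMS))
```

Unlike a free-text search, retrieval here does not depend on the words that happen to appear in a document; it depends only on the descriptor under which the document was indexed.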
"If men learn this, it will implant forgetfulness in their souls; they will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves, but by means of external marks. What you have discovered is a recipe not for memory, but for reminder. And it is no true wisdom that you offer your disciples, but only its semblance, for by telling them of many things without teaching them you will make them seem to know much, while for the most part they know nothing ..." Phaedrus 275 a-b.
If the conclusion of this story is correct, and we do not possess knowledge internally, but must seek knowledge from the writings we retrieve, Plato should have continued the conversation between Socrates and Phaedrus to evaluate the systems that should be created to access the knowledge that one is seeking (e.g., for international commercial law: international codes, case law, scholarly commentaries, legislative history). The story should have also analyzed the impact that the research tools used to access the writings would have on the manner that we conceptualize the writings we uncover.
For a more modern view similar to Thamus', see comments by another state leader: "Much reading is an oppression of the mind and extinguishes the natural candle." William Penn quoted in Daniel Akst, On the Contrary: A Corner Office Has Little Room for Books, N.Y. Times, July 1, 2001, Business, at 4.
"LEXIS and WESTLAW have begun to develop concept-based systems and have introduced 'natural language' search interfaces as a step in this direction. We now have Freestyle and WIN, respectively. Natural language moves towards a conceptual search system, with a list of thousands of commonly used legal phrases indexed in addition to words. But natural language requires a complex search interface, which substitutes a series of mechanical judgments for our decision-making process. The computer program 'identifies' the 'concepts,' which are basically nouns or legal phrases, in the search request, and matches them against its inventory of words and legal phrases. The program identifies other documents with the same concepts and ranks its findings by statistical relevance - primarily by the number of times the concept occurs and how close to the beginning of the document it first occurs.
Like other computer searches, sometimes the results of natural-language searches are extraordinary, and sometimes they are worthless; usually they are somewhere in between. In any event, your ability to think in computerese and the underlying logic of the computer program determines the outcome of your research. This isn't the bias-free, untouched-by-human-hands results we expect of a computer, for many decisions are made for you by the computer program. Furthermore, many programmers are convinced that a better search, even for conceptual information, can be crafted using the Boolean techniques. One developer of CD-ROM-based legal materials stated that natural-language searching compared to Boolean searching is like using an automatic transmission versus a stick shift. 'You don't need to know anything about transmissions to drive an automatic, but all the race cars have stick shifts.'" Russ Armstrong, CD-ROM v. Law Books, Law-Lib Discussion List (Jan. 8, 1996). Bintliff at 347.