= What is your best experience with the Internet?
The day I won a box of Swiss chocolates on the Health On the Net site. But don't rush to this site, the game doesn't exist any more.
= And your worst experience?
The abuse of e-mail: bad-mannered people take advantage of the distance and relative anonymity to say not very nice things and take really juvenile attitudes with, alas, consequences which are not always the kind you find in a children's world. For example, I once forwarded an email to somebody I thought would be interested in the subject and the person wrote directly to the original sender and discredited me.
CATHERINE DOMAIN (Paris)
#Founder of the Ulysses Bookstore (Librairie Ulysse), the oldest travel bookstore in the world
Located in central Paris, on the Ile Saint-Louis in the middle of the river
Seine, Librairie Ulysse is the oldest travel bookstore in the world and has more
than 20,000 books, maps and magazines, out of print and new, including some in
English, about all countries and all kinds of travel. It was set up in 1971 by
Catherine Domain, a member of the French National Union of Antiquarian and
Modern Bookstores (Syndicat national de la librairie ancienne et moderne
(SLAM)), the Explorers' Club (Club des Explorateurs) and the International Club
of Long-Distance Travelers (Club international des grands voyageurs).
Catherine has travelled all over the world for many years, visiting 136 countries, and she is still on the move. In 1998 she went sailing in Kiribati and the Marshall Islands in the Pacific. In 1999, as a judge in the Island Book Prize (Prix du livre insulaire) contest, she visited the French island of Ushant. She also sailed around Sardinia in September.
*Interview of December 4, 1999 (original interview in French)
= Can you tell us about your website?
My site is still pretty basic and under construction. Like my bookstore, it's a place to meet people before being a place of business.
= How did using the Internet change your professional life?
The Internet is a pain in the neck, takes a lot of my time and I earn hardly any money from it, but that doesn't worry me…
= How do you see the future?
I'm very pessimistic, because it's killing off specialist bookstores.
= What do you think of the debate about copyright on the Web?
I must say I'm more concerned about the WTO (World Trade Organization) than about copyright.
= How do you see the growth of a multilingual Web?
Isn't it already multilingual? I think it's going to kill the French language as well as many others.
= What is your best experience with the Internet?
A daily chat with my sister who lives in Sri Lanka and the friends I have in Mexico, the USA, the UK, South Africa etc., because I've travelled a lot, for long periods all over the world.
= And your worst experience?
My first year with a computer and the Internet. It was one long technical agony!
HELEN DRY (Michigan)
#Moderator of The Linguist List
The website of The Linguist List gives an extensive series of links on linguistic resources: the profession (conferences, linguistic associations, programs, etc.); research and research support (papers, dissertation abstracts, projects, bibliographies, topics, texts); publications; pedagogy; language resources (languages, language families, dictionaries, regional information); and computer support (fonts and software).
The Linguist List is moderated by Helen Dry (Eastern Michigan University),
Anthony Aristar (Wayne State University) and Andrew Carnie (University of
Arizona). Helen Dry, who is interviewed here, is a professor of linguistics at
Eastern Michigan University. Her major research interests are linguistic
stylistics, corpus linguistics, pragmatics, and discourse analysis.
*Interview of August 18, 1998
= Is The Linguist List multilingual?
The Linguist List, which I moderate, has a policy of posting in any language, since it is a list for linguists. However, we discourage posting the same message in several languages, simply because of the burden extra messages put on our editorial staff. (We are not a bounce-back list, but a moderated one. So each message is organized into an issue with like messages by our student editors before it is posted.) Our experience has been that almost everyone chooses to post in English. But we do link to a translation facility that will present our pages in any of 5 languages; so a subscriber need not read Linguist in English unless s/he wishes to. We also try to have at least one student editor who is genuinely multilingual, so that readers can correspond with us in languages other than English.
*Interview of July 26, 1999
= What has happened since our last interview?
We are beginning to collect some primary data. For example, we have searchable databases of dissertation abstracts relevant to linguistics, of information on graduate and undergraduate linguistics programs, and of professional information about individual linguists. The dissertation abstracts collection is, to my knowledge, the only freely available electronic compilation in existence.
BILL DUNLAP (Paris & San Francisco)
#Founder of Global Reach, a methodology for companies to expand their Internet presence through a multilingual website
Founder of Global Reach, Bill Dunlap specialized in international online marketing and e-commerce, mainly for American companies. Global Reach is a methodology for companies to expand their Internet presence into a more international framework. This includes translating a website into other languages and actively promoting it through a promotional campaign in the targeted countries, to increase local website traffic.
Bill Dunlap, an MIT (Massachusetts Institute of Technology) graduate, has made a life of bringing high-tech products and services to the international markets. When the microcomputer industry was in its early stages in the early 1980s, he set up a company to export popular Apple and PC software to top European markets. This led to a thorough familiarity with the European PC distribution business, and he then worked as AST Research's first European sales manager. Further opportunity brought him into Compaq Computer's newly established Paris office, where he became Compaq's first sales manager in France. He continued with Compaq afterwards at their European headquarters in Munich and managed Scandinavian sales.
Since the mid-1980s, Bill Dunlap has developed the international marketing consultancy Euro-Marketing Associates from Paris and San Francisco. In 1995, Euro-Marketing Associates was restructured into a virtual consultancy called Global Reach, a group of top online marketers throughout the world. The goal is to promote clients' websites in each targeted country, thus attracting more online traffic: more traffic, more sales.
*Interview of December 11, 1998
= How did using the Internet change your professional life?
Since 1981, when my professional life started, I've been involved with bringing American companies to Europe. This is very much an issue of language, since the products and their marketing have to be in the languages of Europe in order for them to be visible here. Since the Web became popular in 1995 or so, I've turned these activities to their online dimension, and have come to champion European e-commerce among my American compatriots. Most recently, at Internet World in New York, I spoke about European e-commerce and how to use a Website to address the various markets in Europe.
= What is the purpose of the Global Reach program?
Promoting your Web site is at least as important as creating it, if not more important. You should be prepared to spend at least as much time and money in promoting your Web site as you did in creating it in the first place. With the Global Reach program, you can have it promoted in countries where English is not spoken, and achieve a wider audience… and more sales. There are many good reasons for taking the online international market seriously. Global Reach is a means for you to extend your Web site to many countries, speak to online visitors in their own language and reach online markets there.
= How do you see the growth of a multilingual Web?
There are so few people in the U.S. interested in communicating in many languages — most Americans are still under the delusion that the rest of the world speaks English. However, here in Europe (I'm writing from France), the countries are small enough so that an international perspective has been necessary for centuries.
*Interview of July 23, 1999
= What practical suggestions do you have for the development of a multilingual website?
After a website's home page is available in several languages, the next step is the development of content in each language. A webmaster will notice which languages draw more visitors (and sales) than others, and these are the places to start in a multilingual Web promotion campaign. At the same time, it is always good to increase the number of languages available on a website: just a home page translated into other languages would do for a start, before it becomes obvious that more should be done to develop a certain language branch on a website.
= What is your best experience with the Internet?
Working in tandem with hundreds of people, without any pressure. It's a great life.
= And your worst experience?
Several times, I've published an online forum, in which several insulting individuals started sending nasty mail to the forum. It went out to hundreds of people, and then they started sending nasty mail back. It had a snowball effect, and I remember waking up one morning with over 4,000 messages to download. What a mess!
JACQUES GAUCHEY (San Francisco)
#Specialist in the information technology industry, "facilitator" between the
United States and Europe, and journalist
Created in 1993, Jacques Gauchey's consultancy G.a Communications assists start-up Internet and IT (information technology) companies in building their European strategies, partnerships, and visibility. To fulfill its clients' international business development needs, G.a Communications maintains a close-knit network of competences worldwide.
Jacques Gauchey was a director of the Multimedia Development Group (MDG) in 1996-97. He led MDG's International Group from 1994 to 1996, with projects ranging from MDG's M3 conference (1994) to publishing the 1995 and 1996 editions of the guide Going Global: Multimedia Marketing & Distribution.
He was a moderator at such events as the European ETRE & Asian ATRE only-for-CEOs IT conferences (1990, '91 & '92), MDG's "World Multimedia: A Mosaic of Markets" (San Francisco, 1994), Multimedia Live! (San Francisco, 1995), the A.I. (Artificial Intelligence) Soft International Partners seminar (Tokyo, 1996), etc. He moderates focus groups for the IT industry.
From 1985 to 1992, he was the West Coast correspondent for La Tribune, a Paris business daily. He worked previously for Le Figaro and Le Point.
*Interview of July 31, 1999 (original interview in French)
= How did using the Internet change your professional life?
Totally. The whole world is on my computer screen. Everyone now has access to a global database. They have to learn to navigate their way through it or get drowned.
= How do you see the future?
All my clients now are Internet companies. All my working tools (my mobile phone, my PDA and my PC) are or will soon be linked to the Internet.
= What do you think of the debate about copyright on the Web?
Copyright in its traditional context doesn't exist any more. Authors have to get used to a new situation: the total freedom of the flow of information. The original content is like a fingerprint: it can't be copied. So it will survive and flourish.
= How do you see the growth of a multilingual Web?
Technology may solve the problem. May the best one win. The Internet really took off in the US because of a revolutionary concept: only one language — English. The "politically correct" movement for mandatory multilingual teaching in US schools and respect for the various subcultures is a disaster for the future of this country (as it already is in Europe). Individuals have to decide at home if they want to learn another language.
= What is your best experience with the Internet?
Four years ago I published a few issues of a free English newsletter on the Internet. It had about 10 readers per issue until the day (in January 1996) when the electronic version of Wired Magazine created a link to it. In one week I got about 100 e-mails, some from French readers of my book La vallée du risque - Silicon Valley (published by Plon, Paris, at the end of 1990), who were happy to find me again.
= And your worst experience?
The Internet is a medium and, like any medium, can lead to evil. The shooting spree by a day trader in Atlanta in July 1999. Pornography. The unrestricted online sale of guns. Junk mail.
MARCEL GRANGIER (Bern)
#Head of the French Section of the Swiss Federal Government's Central Linguistic
Services
*Interview of January 14, 1999 (original interview in French)
= How did using the Internet change your professional life?
To work without the Internet is simply impossible now. Apart from all the tools used (e-mail, the electronic press, services for translators), the Internet is for us a vital and endless source of information in what I'd call the "non-structured sector" of the Web. For example, when the answer to a translation problem can't be found on websites presenting information in an organized way, in most cases search engines allow us to find the missing link somewhere on the network.
= How do you see the growth of a multilingual Web?
We can see multilingualism on the Internet as a happy and irreversible inevitability. So we have to laugh at the doomsayers who only complain about the supremacy of English. Such supremacy isn't wrong in itself, because it's mainly based on statistics (more PCs per inhabitant, more people speaking English, etc.). The answer isn't to "fight English," much less whine about it, but to build more sites in other languages. As a translation service, we also recommend that websites be multilingual.
= How do you see the future?
The increasing number of languages on the Internet is inevitable and can only boost multicultural exchanges. For this to happen in the best possible circumstances, we still need to develop tools to improve compatibility. Fully coping with accents and other characters is only one example of what can be done.
*Interview of January 25, 2000 (original interview in French)
= Can you tell us about your website?
Our website was first conceived as an Intranet service for translators in Switzerland, who often deal with the same kind of material as the federal government's translators. Some parts of it are useful to any translators, wherever they are. The electronic dictionaries (Dictionnaires électroniques) are only one section of the website. Other sections deal with administration, law, the French language and general information. The site also hosts the pages of the Conference of Translation Services of European States (COTSOES).
= What exactly is your professional activity?
I'm head of the French Section of the Swiss Federal Government's Central Linguistic Services, which means I'm in charge of organising translation matters for all the linguistic services of the Swiss government.
= What do you think of the debate about copyright on the Web?
There's a problem here and the solution isn't obvious. It's a pity the battle against this kind of fraud will eventually justify, along with other abuses, a "Web police," which sadly is very far from the spirit in which the Web was created.
= How do you see the growth of a multilingual Web?
We now have a multilingual Internet. We have to build it up and ensure it's easy to access, which'll probably take a bit longer.
BARBARA GRIMES (Hawaii)
#Editor of Ethnologue: Languages of the World
The Ethnologue is a catalogue of more than 6,700 languages. A paper version and a CD-ROM are also available.
*Interview of August 18, 1998
= How did using the Internet change your professional life?
We have found the Internet to be useful, convenient, and supplementary to our work. Our main use of it is for e-mail. It is also a convenient means of making information available to a wider audience than the printed Ethnologue reaches.
On the other hand, many people in the audience we wish to reach do not have access to computers, so in some ways the Ethnologue on the Internet reaches a limited audience who own computers. I am particularly thinking of people in the so-called "third world".
= How do you see the growth of a multilingual Web?
Multilingual web pages are more widely useful, but much more costly to maintain. We have had requests for the Ethnologue in a few other languages, but we do not have the personnel or funds to do the translation or maintenance, since it is constantly being updated.
*Interview of January 15, 2000
= Can you tell us about the Ethnologue?
It is a catalog of the languages of the world, with information about where they are spoken, an estimate of the number of speakers, what language family they are in, alternate names, names of dialects, other sociolinguistic and demographic information, dates of published Bibles, a name index, a language family index, and language maps.
= What exactly is your professional activity?
I am the editor of the 8th to 14th editions, 1971-2000.
= What do you think of the debate about copyright on the Web?
Any copyrights should be respected, just as with print matter.
= What is your best experience with the Internet?
Receiving corrections and new reliable information.
= And your worst experience?
Unkind criticism or that which does not include corrections.
MICHAEL HART (Illinois)
#Founder of Project Gutenberg, the oldest digital library on the Internet
Project Gutenberg, set up by Michael Hart in 1971 when he was a student at the University of Illinois (USA), was the Internet's first information provider. From the beginning, its mission has been to put at everybody's disposal, free, as many books as possible whose copyright has expired. It is now the biggest digital library on the Web in terms of the number of books (3,700 e-texts in July 2001) that have been patiently digitized in text format by 600 volunteers from all over the world. Some old documents are typed line by line, mainly because the originals are unclear, but most works are scanned using OCR (optical character recognition) software. Then they are read and corrected twice, sometimes by two different people. At first they were just books in English, but now ones in other languages are being digitized.
*Interview of August 23, 1998
= How do you see the relationship between the print media and the Internet?
We consider e-text to be a new medium, with no real relationship to paper, other than presenting the same material, but I don't see how paper can possibly compete once people each find their own comfortable way to e-texts, especially in schools.
= How did using the Internet change your professional life?
My career couldn't have happened without the Internet, and neither could Project Gutenberg have happened. I presume you know that Project Gutenberg was the first information provider on the Net.
= What are your new projects?
My own personal goal is to put 10,000 Etexts on the Net, and if I can get some major support, I would like to expand that to 1,000,000 and to also expand our potential audience for the average Etext from 1.x% of the world population to over 10%, thus changing our goal from giving away 1,000,000,000,000 Etexts to 1,000 times as many, a trillion and a quadrillion in US terminology.
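The arithmetic behind these figures can be checked with a short sketch. It assumes that "1.x%" is rounded to 1% and that "over 10%" is taken as exactly 10%; the variable names are illustrative and do not come from the interview.

```python
# Rough check of the scaling described in the answer above.
# Assumptions (not from the interview): "1.x%" is rounded to 1%,
# and "over 10%" is taken as exactly 10%.
current_etexts = 10_000
expanded_etexts = 1_000_000
current_audience_pct = 1
expanded_audience_pct = 10

# 100 times as many Etexts, reaching 10 times as large an audience.
etext_factor = expanded_etexts // current_etexts
audience_factor = expanded_audience_pct // current_audience_pct
scale = etext_factor * audience_factor

current_goal = 1_000_000_000_000      # a trillion in US terminology
expanded_goal = current_goal * scale  # a quadrillion in US terminology

print(scale)
print(expanded_goal == 10**15)
```

Under these rounded assumptions, the factor of 1,000 comes from multiplying the growth in the number of Etexts by the growth in audience share.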
*Interview of July 23, 1999
= What do you think of the debate about copyright on the Web?
The kind of copyright debate going on is totally impractical. It is run by and for the "Landed Gentry of the Information Age." Information Age? For whom? No one has said more against copyright extensions than I have, but Hollywood and the big publishers have seen to it that our Congress won't even mention it in public.
= What exactly are these copyright extensions?
Nothing will expire for another 20 years. We used to have to wait 75 years. Now it is 95 years. Before the 75-year term it was 28 years (plus a possible 28-year extension, only on request), and before that 14 years (plus a possible 14-year extension). So, as you can see, this is a serious degrading of the public domain, as a matter of continuing policy.
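As a rough check of the figures quoted above, the sketch below adds up the maximum terms as stated in this answer. The labels and the list layout are illustrative; the numbers are only those given in the interview, not a full legal history.

```python
# Maximum US copyright terms as quoted in the answer above (in years).
# Each entry: (label, base term, optional extension). Labels are illustrative.
terms = [
    ("earliest term", 14, 14),
    ("later term", 28, 28),
    ("before the latest extension", 75, 0),
    ("now", 95, 0),
]

max_terms = [base + extension for _, base, extension in terms]
for (label, _, _), total in zip(terms, max_terms):
    print(f"{label}: up to {total} years")

# Going from 75 to 95 years adds 20 years, which matches the statement
# that nothing will expire for another 20 years.
freeze_years = max_terms[-1] - max_terms[-2]
print(freeze_years)
```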
= How do you see the growth of a multilingual Web?
We will eventually have a really good Babelfish (AltaVista's translation software). I am publishing in one new language per month right now, and will continue as long as possible.
= What is your best experience with the Internet?
The notes I get that tell me people appreciate that I have spent my life putting books, etc., on the Internet. Some are quite touching, and can make my whole day.
= And your worst experience?
Getting called on the Chancellor's carpet because Oxford University called him and really shook him up… but I had a team of 6 lawyers, half from the University of Illinois, who backed me up, so we made Oxford back down. You might say that was a good memory, but I hate that kind of politicking… the Chancellor was Tom Cruise's uncle, so that was fun.
ROBERTO HERNANDEZ MONTOYA (Caracas)
#Head of the digital library of the electronic magazine Venezuela Analítica
Roberto Hernández Montoya has a literature degree from the Central University of
Venezuela. He is a columnist at El Nacional, Letras, Imagen and Internet World
Venezuela. He is a member of the editorial board of Venezuela Cultural,
Venezuela Analítica and Imagen. He studied discourse analysis at the School of
High Studies in Social Sciences (Ecole des hautes études en sciences sociales -
EHESS), Paris. He was the founding president of the Venezuelan Association of
Editors, and the editor of the Ateneo de Caracas.
Venezuela Analítica, an electronic magazine conceived as a public forum to exchange ideas on politics, economics, culture, science and technology, created in May 1997 BitBlioteca, a digital library which contains material mostly in Spanish, and also in French, English and Portuguese.
*Interview of September 3, 1998 (original interview in French)
= How do you see the relationship between the print media and the Internet?
The printed word can't be replaced, at least not in the foreseeable future. The paper book is a wonderful thing. We can't leaf through an electronic text in the same way. But we can find words and groups of words much more quickly. We can read an electronic text more carefully, even with the inconvenience of reading it on the screen. It is less expensive and can be more easily distributed worldwide (not counting the cost of the computer and Internet connection).
= How did using the Internet change your professional life?
The Internet has been personally very important for me. It's become the centre of my life. It's meant that our organization can now communicate with thousands of people — something we couldn't have afforded if we'd published a paper magazine. I think the Internet is going to be the chief means of communication and exchanging information in the future.
RANDY HOBLER (Dobbs Ferry, New York)
#Internet Marketing Consultant, among others at Globalink, a company specialized in language translation software and services
Randy Hobler has been a consultant in Internet marketing at IBM, Johnson & Johnson, Burroughs Wellcome, Pepsi, Heublein, etc. In 1998, he was an Internet Marketing Consultant for Globalink, a company specialized in language translation software and services. He wrote: "The joy for me is the ability to combine my vocational skills in high-tech and marketing with avocational interests like language into one. To love what you do and do what you love." Globalink was bought by Lernout & Hauspie in 1999.
*Interview of September 3, 1998
= How do you see the growth of a multilingual Web?
85% of the content of the Web in 1998 is in English and going down. This trend is driven not only by more websites and users in non-English-speaking countries, but by increasing localization of company and organization sites, and increasing use of machine translation to/from various languages to translate websites.
Because the Internet has no national boundaries, the organization of users is bounded by other criteria driven by the medium itself. In terms of multilingualism, you have virtual communities, for example, of what I call "Language Nations"… all those people on the Internet wherever they may be, for whom a given language is their native language. Thus, the Spanish Language nation includes not only Spanish and Latin American users, but millions of Hispanic users in the US, as well as odd places like Spanish-speaking Morocco.
= Can you tell us about the future of machine translation?
We are rapidly reaching the point where highly accurate machine translation of text and speech will be so common as to be embedded in computer platforms, and even in chips in various ways. At that point, and as the growth of the Web slows, the accuracy of language translation hits 98% plus, and the saturation of language pairs has covered the vast majority of the market, language transparency (any-language-to-any-language communication) will be too limiting a vision for those selling this technology. The next development will be "transcultural, transnational transparency", in which other aspects of human communication, commerce and transactions beyond language alone will come into play. For example, gesture has meaning, facial movement has meaning and this varies among societies. The thumb-index finger circle means 'OK' in the United States. In Argentina, it is an obscene gesture.
When the inevitable growth of multi-media, multi-lingual videoconferencing comes about, it will be necessary to 'visually edit' gestures on the fly. The MIT (Massachusetts Institute of Technology) Media Lab, Microsoft and many others are working on computer recognition of facial expressions, biometric access identification via the face, etc. It won't be any good for a US business person to be making a great point in a Web-based multi-lingual video conference to an Argentinian, having his words translated into perfect Argentinian Spanish if he makes the "O" gesture at the same time. Computers can intercept this kind of thing and edit it on the fly.
There are thousands of ways in which cultures and countries differ, and most of these are computerizable to change as one goes from one culture to the other. They include laws, customs, business practices, ethics, currency conversions, clothing size differences, metric versus English system differences, etc. Enterprising companies will be capturing and programming these differences and selling products and services to help the peoples of the world communicate better. Once this kind of thing is widespread, it will truly contribute to international understanding.
*Interview of September 10, 2000
= What do you think about e-books?
E-books continue to grow as the display technology improves, and as the hardware becomes more physically flexible and lighter. Plus, among the early adopters will be colleges because of the many advantages for students (ability to download all their reading for the entire semester, inexpensiveness, linking into exams, assignments, need for portability, eliminating need to lug books all over).
EDUARD HOVY (Marina del Rey, California)
#Head of the Natural Language Group at USC/ISI (University of Southern
California / Information Sciences Institute)
The Natural Language Group (NLG) at the Information Sciences Institute of the University of Southern California (USC/ISI) is currently involved in various aspects of computational/natural language processing. The group's projects are: machine translation; automated text summarization; multilingual verb access and text management; development of large concept taxonomies (ontologies); discourse and text generation; construction of large lexicons for various languages; and multimedia communication.
Eduard Hovy, its director, is a member of the Computer Science Departments of USC and of the University of Waterloo. He completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987. His research focuses on machine translation, automated text summarization, text planning and generation, and the semi-automated construction of large lexicons and terminology banks. The Natural Language Group at ISI currently has projects in most of these areas.
Dr. Hovy is the author or editor of four books and over 100 technical articles.
He currently serves as the President of the Association for Machine Translation
in the Americas (AMTA). He is Vice President of the Association for
Computational Linguistics (ACL), and has served on the editorial boards of
Computational Linguistics and the Journal of the Society of Natural Language
Processing of Japan.
*Interview of August 27, 1998
= How do you see the growth of a multilingual Web?
In the context of information retrieval (IR) and automated text summarization (SUM), multilingualism on the Web is another complicating factor. People will write in their own language for several reasons (convenience, secrecy, and local applicability), but that does not mean that other people are not interested in reading what they have to say! This is especially true for companies involved in technology watch (say, a computer company that wants to know, daily, all the Japanese newspaper and other articles that pertain to what they make) or some government intelligence agencies (the people who provide the most up-to-date information for use by your government officials in making policy, etc.). One of the main problems faced by these kinds of people is the flood of information, so they tend to hire "weak" bilinguals who can rapidly scan incoming text and throw out what is not relevant, giving the relevant stuff to professional translators. Obviously, a combination of SUM and MT (machine translation) will help here; since MT is slow, it helps if you can do SUM in the foreign language, and then just do a quick and dirty MT on the result, allowing either a human or an automated IR-based text classifier to decide whether to keep or reject the article.
For these kinds of reasons, the US Government has over the past five years been funding research in MT, SUM, and IR, and is interested in starting a new program of research in Multilingual IR. This way you will be able to one day open Netscape or Explorer or the like, type in your query in (say) English, and have the engine return texts in all the languages of the world. You will have them clustered by subarea, summarized by cluster, and the foreign summaries translated, all the kinds of things that you would like to have.
You can see a demo of our version of this capability, using English as the user language and a collection of approx. 5,000 texts of English, Japanese, Arabic, Spanish, and Indonesian, by visiting MuST (Multilingual information retrieval, summarization, and translation system).
Type your query word (say, "baby", or whatever you wish) in and press Enter/Return. In the middle window you will see the headlines (or just keywords, translated) of the retrieved documents. On the left you will see what language they are in: "Sp" for Spanish, "Id" for Indonesian, etc. Click on the number at the left of each line to see the document in the bottom window. Click on "Summarize" to get a summary. Click on "Translate" for a translation (but beware: Arabic and Japanese are extremely slow! Try Indonesian for a quick word-by-word "translation" instead).
This is not a product (yet); we have lots of research to do in order to improve the quality of each step. But it shows you the kind of direction we are heading in.
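The order of steps described in this answer (summarize in the foreign language first, run a quick and dirty translation on the summary only, then let a classifier keep or reject the article) can be sketched roughly as follows. This is a toy illustration, not the MuST system: the tiny glossary, the first-sentence "summarizer", the keyword filter and all function names are hypothetical stand-ins for real SUM, MT and IR components.

```python
# Toy sketch of the summarize-then-translate-then-filter order described above.
# Everything here is hypothetical: the glossary, the first-sentence summarizer
# and the keyword filter stand in for real SUM, MT and IR systems.

GLOSSARY = {  # illustrative word-by-word Indonesian-to-English glossary
    "bayi": "baby",
    "sehat": "healthy",
    "lahir": "born",
    "harga": "price",
    "beras": "rice",
    "naik": "rises",
}

def summarize(text: str) -> str:
    """Crude summary: keep only the first sentence."""
    return text.split(".")[0].strip()

def translate(text: str) -> str:
    """Word-by-word translation; unknown words pass through unchanged."""
    return " ".join(GLOSSARY.get(word.lower(), word) for word in text.split())

def is_relevant(english_text: str, query: str) -> bool:
    """Keyword match standing in for an IR-based text classifier."""
    return query.lower() in english_text.lower().split()

def triage(documents: list[str], query: str) -> list[str]:
    """Summarize first, translate only the summary, then keep or reject."""
    kept = []
    for document in documents:
        gist = translate(summarize(document))
        if is_relevant(gist, query):
            kept.append(gist)
    return kept

docs = [
    "Bayi sehat lahir. Teks panjang lainnya.",
    "Harga beras naik. Teks panjang lainnya.",
]
print(triage(docs, "baby"))
```

The point of the ordering is that the slow step (translation) only ever sees the short summary, never the full article.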
= How do you see the future?
The Internet is, as I see it, a fantastic gift to humanity. It is, as one of my graduate students recently said, the next step in the evolution of information access. A long time ago, information was transmitted orally only; you had to be face-to-face with the speaker. With the invention of writing, the time barrier broke down — you can still read Seneca and Moses. With the invention of the printing press, the access barrier was overcome — now anyone with money to buy a book can read Seneca and Moses. And today, information access becomes almost instantaneous, globally; you can read Seneca and Moses from your computer, without even knowing who they are or how to find out what they wrote; simply open AltaVista and search for "Seneca". This is a phenomenal leap in the development of connections between people and cultures. Look how today's Internet kids are incorporating the Web in their lives.
The next step? — I imagine it will be a combination of computer and cellular phone, allowing you as an individual to be connected to the Web wherever you are. All your diary, phone lists, grocery lists, homework, current reading, bills, communications, etc., plus AltaVista and the others, all accessible (by voice and small screen) via a small thing carried in your purse or on your belt. That means that the barrier between personal information (your phone lists and diary) and non-personal information (Seneca and Moses) will be overcome, so that you can get to both types anytime. I would love to have something that tells me, when next I am at a conference and someone steps up, smiling to say hello, who this person is, where last I met him/her, and what we said then!
But that is the future. Today, the Web has made big changes in the way I shop (I spent 20 minutes on the Web looking for plane routes for my next trip, which has a difficult connection, instead of waiting for my secretary to ask the travel agent, which takes a day). I look for information on anything I want to know about, instead of having to make a trip to the library and look through complicated indexes. I send e-mail to you about this question, at a time that is convenient for me, rather than your having to make a phone appointment and then us talking for 15 minutes. And so on.
*Interview of August 8, 1999
= What has happened since our first interview?
Over the past 12 months I have been contacted by a surprising number of new information technology (IT) companies and startups. Most of them plan to offer some variant of electronic commerce (online shopping, bartering, information gathering, etc.). Given the rather poor performance of current non-research level natural language processing technology (when was the last time you actually easily and accurately found a correct answer to a question on the Web, without having to spend too much time sifting through irrelevant information?), this is a bit surprising. But I think everyone feels that the new developments in automated text summarization, question analysis, and so on, are going to make a significant difference. I hope so! But that level of performance is not available yet.
It seems to me that we will not get a big breakthrough, but we will get a somewhat acceptable level of performance, and then see slow but sure incremental improvement. The reason is that it is very hard to make your computer really "understand" what you mean: this requires us to build into the computer a network of "concepts" and their interrelationships that (at some level) mirror those in your own mind, at least in the subject areas of interest. The surface (word) level is not adequate: when you type in "capital of Switzerland", current systems have no way of knowing whether you mean "capital city" or "financial capital". Yet the vast majority of people would choose the former reading, based on phrasing and on knowledge about what kinds of things one is likely to ask the Web, and in what way.
Several projects are now building, or proposing to build, such large "concept" networks. This is not something one can do in two years, and not something that has a correct result. We have to develop both the network and the techniques for building it semi-automatically and self-adaptively. This is a big challenge.
= What do you think about the debate concerning copyright on the Web? What practical solutions would you suggest?
As an academic, I am of course one of the parasites of society, and hence all in favor of free access to all information. But as a part-owner of a small startup company, I am aware of how much it costs to assemble and format information, and the need to charge somehow.
To balance these two wishes, I like the model by which raw information (and some "raw" resources, such as programming languages and basic access capabilities like the Web search engines) is made available for free. This creates a market and allows people to do at least something. But processed information, and the systems that help you get and structure just exactly what you need, I think should be paid for. That allows developers of new and better technology to be rewarded for their effort.
Take an example: a dictionary, today, is not free. Dictionary companies refuse to make them available to research groups and others for free, arguing that they have centuries of work invested. (I have had several discussions with dictionary companies on this.) But dictionaries today are stupid products — you have to know the word before you can find the word! I would love to have something that allows me to give an approximate meaning, or perhaps a sentence or two with a gap where I want the word I am looking for, or even the equivalent in another language, and returns the word(s) I am looking for. This is not hard to build, but you need the core dictionary to start with. I think we should have the core dictionary freely available, and pay for the engine (or the service) that allows you to enter partial or only somewhat accurate information and helps you find the best result.
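A minimal sketch of the kind of lookup described above, under loud assumptions: the three dictionary entries, the stopword list, and the function names are all invented for illustration, and plain word overlap stands in for whatever matching a real engine would use.

```python
import re

# Invented toy entries; a real service would start from a full core dictionary.
TOY_DICTIONARY = {
    "umbrella": "a folding canopy carried in the hand for protection against rain",
    "lighthouse": "a tower with a bright light that warns ships near the coast",
    "glossary": "an alphabetical list of terms with their meanings",
}

# Hypothetical stopword list: words too common to help with matching.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "with", "that", "to", "their", "you"}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def reverse_lookup(description, dictionary=TOY_DICTIONARY):
    """Return headwords whose definitions share words with the description, best first."""
    query = content_words(description)
    scored = []
    for headword, definition in dictionary.items():
        overlap = len(query & content_words(definition))
        if overlap:
            scored.append((overlap, headword))
    # Highest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [headword for _, headword in scored]


print(reverse_lookup("thing you carry for protection when it rains"))
print(reverse_lookup("tower with a light for ships"))
```

Exact word overlap is the weakest possible matcher ("carry" does not match "carried", for instance), which is one reason the engine, rather than the raw dictionary, is where the effort would go.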
A second example: you should have free access to all the Web, and to basic search engines like those available today. No copyrights, no license fees. But if you want an engine that provides a good targeted answer, pinpointed and evaluated for trustworthiness, then I think it is not unreasonable to pay for that.
Naturally, an encyclopedia builder will not like my proposal. But to him or her I say: package your encyclopedia inside a useful access system, because without it the raw information you provide is just more data, and can easily get lost in the sea of data available and growing every hour.
*Interview of September 2, 2000
= What has happened since our last interview?
I see a continued increase in small companies using language technology in one way or another: either to provide search, or translation, or reports, or some other communication function. The number of niches in which language technology can be applied continues to surprise me: from stock reports and updates to business-to-business communications to marketing…
With regard to research, the main breakthrough I see was led by a colleague at ISI (I am proud to say), Kevin Knight. A team of scientists and students last summer at Johns Hopkins University in Maryland developed a faster and otherwise improved version of a method originally developed (and kept proprietary) by IBM about 12 years ago. This method allows one to create a machine translation (MT) system automatically, as long as one gives it enough bilingual text. Essentially the method finds all correspondences in words and word positions across the two languages and then builds up large tables of rules for what gets translated to what, and how it is phrased.
Although the output quality is still low — no-one would consider this a final product, and no-one would use the translated output as is — the team built a (low-quality) Chinese-to-English MT system in 24 hours. That is a phenomenal feat — this has never been done before. (Of course, say the critics: you need something like 3 million sentence pairs, which you can only get from the parliaments of Canada, Hong Kong, or other bilingual countries; and of course, they say, the quality is low. But the fact is that more bilingual and semi-equivalent text is becoming available online every day, and the quality will keep improving to at least the current levels of MT engines built by hand. Of that I am certain.)
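As a rough illustration of how word correspondences can be learned from bilingual text alone, here is a toy sketch in the spirit of the simplest published IBM translation model (usually called IBM Model 1). It is not the system built at Johns Hopkins: it ignores word positions, the three German-English sentence pairs are invented, and the function names are hypothetical.

```python
from collections import defaultdict

# Invented toy corpus of (source sentence, target sentence) pairs.
PAIRS = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]


def train_word_table(sentence_pairs, iterations=10):
    """Estimate t(target_word | source_word) by expectation-maximization."""
    # Start from a uniform table: every pairing is equally plausible.
    table = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (target, source) pairings
        total = defaultdict(float)   # expected count of each source word
        for source, target in sentence_pairs:
            for t_word in target:
                # Share one unit of credit for t_word among the source words,
                # in proportion to the current table.
                norm = sum(table[(t_word, s_word)] for s_word in source)
                for s_word in source:
                    share = table[(t_word, s_word)] / norm
                    count[(t_word, s_word)] += share
                    total[s_word] += share
        # Re-estimate the table from the expected counts.
        for (t_word, s_word), c in count.items():
            table[(t_word, s_word)] = c / total[s_word]
    return table


def best_translation(table, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in table.items() if s == source_word}
    return max(candidates, key=candidates.get)


table = train_word_table(PAIRS)
for word in ("das", "Haus", "Buch", "ein"):
    print(word, "->", best_translation(table, word))
```

Nothing in the code knows any German or English; the pairings emerge only because "das" keeps co-occurring with "the" and "Buch" with "book". That is also why the approach needs very large amounts of bilingual text to work on real language.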
Other developments are less spectacular. There's a steady improvement in the performance of systems that can decide whether an ambiguous word such as "bat" means "flying mammal" or "sports tool" or "to hit"; there is solid work on cross-language information retrieval (which will soon let you find Chinese and French documents on the Web even though you type in English-only queries), and there is some rather rapid development of systems that answer simple questions automatically (rather like the popular web system AskJeeves, but this time done by computers, not humans). These systems refer to a large collection of text to find "factoids" (not opinions or causes or chains of events) in response to questions such as "what is the capital of Uganda?" or "how old is President Clinton?" or "who invented the xerox process?", and they do so rather better than I had expected.
= What do you think about e-books?
E-books, to me, are a non-starter. Even more than seeing a concert live or a film at a cinema, I like the physical experience of holding a book in my lap and enjoying its smell and feel and heft. Concerts on TV, films on TV, and e-books lose some of the experience; and with books particularly it is a loss I do not want to accept. After all, it's much easier and cheaper to get a book in my own purview than a concert or cinema. So I wish the e-book makers well, but I am happy with paper. And I don't think I will end up in the minority anytime soon: I am much less afraid of books vanishing than I once was of cinemas vanishing.
= What is your definition of cyberspace?
I define cyberspace as the totality of information that we can access via the Internet and computer systems in general. It is not, of course, a space, and it has interesting differences with libraries. For example, soon my fridge, my car, and I myself will be "known" to cyberspace, and anyone with the appropriate access permission (and interest) will be able to find out what exactly I have in my fridge and how fast my car is going (and how long before it needs new shock absorbers) and what I am looking at now. In fact, I expect that advertisements will change their language and perhaps even pictures and layout to suit my knowledge and tastes as I walk by, simply by recognizing that "here comes someone who speaks primarily English and lives in Los Angeles and makes $X per year". All this behaviour will be made possible by the dynamically updatable nature of cyberspace (in contrast to a library), and the fact that computer chips are still shrinking in size and in price. So just as today I walk around in "socialspace" (a web of social norms, expectations, and laws), tomorrow I will be walking around in an additional cyberspace of information that will support me (sometimes) and restrict me (other times) and delight me (I hope often) and frustrate me (I am sure).
= And your definition of the information society?
An information society is one in which people in general are aware of the importance of information as a commodity, and attach a price to it as a matter of course. Throughout history, some people have always understood how important information is, for their own benefit. But when the majority of society starts working with and on information per se, then the society can be called an information society. This may sound a bit vacuous or circularly defined, but I bet you that anthropologists can go and count what percentage of society was dedicated to information processing as a commodity in each society. Where they initially will find only teachers, rulers' councillors, and sages, they will in later societies find people like librarians, retired domain experts (consultants), and so on. The jumps in communication of information from oral to written to printed to electronic each widened information dissemination (in time and space), thereby making it less and less necessary to re-learn and re-do certain difficult things. In an ultimate information society, I suppose, you would state your goal and then the information agencies (both the cyberspace agents and the human experts) would conspire to bring you the means to achieve it, or to achieve it for you, minimizing the amount of work you'd have to do to only what is truly new or truly needs to be re-done with the material at hand.
CHRISTIANE JADELOT (Nancy, France)
#Researcher at the INaLF (Institut national de la langue française - National Institute of the French Language)
The purpose of the INaLF, part of France's National Centre for Scientific Research (Centre national de la recherche scientifique, CNRS), is to design research programmes on the French language, particularly its vocabulary. The INaLF's constantly expanding and revised data, processed by special computer systems, deal with all aspects of the French language: literary discourse (14th-20th centuries), everyday language (written and spoken), scientific and technical language (terminologies), and regional languages. This data, which is a very important study resource, is made available to people interested in the French language (teachers and researchers, business people, the service sector and the general public) through publications and databases.
Christiane Jadelot is an expert in computerized lexicography. She is currently in charge of putting the eighth edition of the Dictionnaire de l'Académie française (Dictionary of the French Academy) (1932-1935) online.
*Interview of June 8, 1998 (original interview in French)
= What is the history of the INaLF website?
At the request of Robert Martin, the head of INaLF, our first pages were posted on the Internet in mid-1996. I helped set up these web pages with tools that cannot be compared to the ones we have nowadays. I was working with tools on Unix, which were not very easy to use. We had little practical experience then, and the pages were very cluttered. But the INaLF thought it was very important to make ourselves known through the Internet, which many firms were already using to sell their products. As we are a "research and services" organization, we have to find customers for our computer products, the best known being the text database Frantext. I think Frantext was already on the Internet (since early 1995), and there was also a draft version of volume 14 of the TLF (Trésor de la langue française). So we had to publicize INaLF activities in this way. It met a general need.
= How did using the Internet change your professional life?
I began to really use it in 1994, with a browser called Mosaic. I found it a very useful way of improving my knowledge of computers, linguistics, literature… everything. I was finding the best and the worst, but as a discerning user, I had to sort it all out and make choices. I particularly liked the software for e-mail, file transfers and dial-up connections. At that time I had problems with a programme called Paradox and character sets that I couldn't use. I tried my luck and threw out a question in a specialist news group. I got answers from all over the world. Everyone seemed to want to solve my problem! I wasn't used to this kind of support. The French are more used to working alone, without reaching out.
= How do you see the future?
I think we have to equip more and more laboratories with high-tech hardware and software so we can use all these new media. We have got projects for schools and research centers. The French education ministry has promised to give all schools cable line access, which is a pressing national need. I saw a TV programme about a small rural primary school's experience of the Internet. The pupils were communicating by e-mail with schools all over the world. This is very enriching, especially when supervised by specially-trained teachers. So that is how I see the Internet. Now I am equipped at home, more for fun, and I hope to convince my daughter to use all these tools to the fullest.
*Interview of August 10, 1999 (original interview in French)
= What do you think of the debate about copyright on the Web?
With its text database Frantext, the INaLF is greatly affected by problems of copyright and publisher's rights. I think the rules should be more flexible. At the moment, use of the database is restricted, which reduces its influence and the spread of French in general.
= How do you see the growth of a multilingual Web?
Personally I have no problem about the use of English, which has to be regarded as a shared communication tool. But websites should offer access both in English and in the language of their country of origin.
= What is your best experience with the Internet?
It was the one I recalled in 1998, when I got responses from all over the world to my very trivial question about type-faces.
= And your worst experience?
When I sent an email to someone by mistake. Sometimes this communication tool has to be used carefully. It goes faster than the human brain and can then be used by the recipient in a very ugly way.
JEAN-PAUL (Paris)
#Webmaster of cotres furtifs (Furtive Cutter Ships), a website that tells stories in 3D
The cotres furtifs site was launched on October 20, 1998, after its creators had formed a group. Following a break to show solidarity with the Altern web server (which fell foul of the inadequate French laws about the Internet), they are now offering two parts and preparing a third. The aim is to tell stories in 3D and explore how a 'link' opens the way for 'hyperwriting,' which is a set of characters, sounds and animations. It gives priority to words.
Jean-Paul is a writer and a musician. In June 1998, he wrote: "The Internet allows me to do without intermediaries, such as record companies, publishers and distributors. Most of all, it allows me to crystallize what I have in my head (and elsewhere): the print medium (desktop-publishing, in fact) only allows me to partly do that. Then the intermediaries will take over and I'll have to look somewhere else, a place where the grass is greener…"
*Interview of August 5, 1999 (original interview in French)
= How do you see the future of cyber-literature?
The future of cyber-literature, techno-literature or whatever you want to call it, is set by the technology itself. It's now impossible for an author to handle the words and their movement and sound all by himself. A decade ago, you could know each of Director, Photoshop or Cubase well (to cite just the better-known software), using the first version of each. That's not possible any more. Now we have to know how to delegate, find more solid financial partners than Gallimard, and look in the direction of Hachette-Matra, Warner, the Pentagon and Hollywood.
At best, the status of the, what… hack? multimedia director? will be that of a video director, a film director, the manager of the product. He or she is the one who receives the golden palms at Cannes, but who would never have been able to earn them alone. As the twin sister (not a clone) of the cinematograph, cyber-literature (video + the link) will be an industry, with a few isolated craftsmen on the outer edge (and therefore with below-zero copyright).
= What exactly is a cutter?
It is called that because it seems to cut through the water. It's a sturdy little naval vessel with a single mast. Cutters were an important part of naval fleets because they were quick and easy to operate. They were the favourite boats of pirates, smugglers and… maritime postal workers.
"Now that the earth is flat and the seas desalinated, it's time for our cutters to thread their way through the 6 billion (soon six and a half billion) stars that we are. And for them all to link up with each other." (The running cutter)
= Why do you use just your first name, instead of your full name?
My reasoning is that, on the Web, there's everything to be done. Except for CERN (European Center for Particle Research) and the Pentagon (which are going to make another web, designed just for their own use), nobody knows what exactly it offers us. So we can work freely while believing that probably everything is open. And use this unlimited, internal space as widely and quickly as possible before the rapacious star-spangled banners of 0 and 1 catch up with and overtake us.
But if it's just a matter of repeating the same things as before, what's the point?
This business of using a surname (directly linked to the copyright problem) takes us back to basics, to the central untouchable principle of our planet: private property. Within the space of a few centuries, we have been reduced to a name, just one name, all the "cleaner" because it has been stripped of all humanity and reduced to a social security barcode. It's not something natural, but a choice of the society, desired by managers. How could we run a modern society and give back to Caesar his due if each of us could change our administrative identity several times in our lives, from "Daredevil on Rollers" to "Motorcycle on the Curves" and then "Hippy Smoking on the Verandah" (you know, like me, that a simple software programme could easily take care of all this)? "Human nature is basically evil and all criminals take advantage of that. But we're here to protect you and your identity." (The Pentagon) And the first thing a down-and-out person does to assert themselves, someone whose papers are never in order, is to scribble their name on a billboard advertising some big commercial product.
On our site, we discreetly try something else.
We exist, we have an address. We know it's hard to speak to each other in anonymity or in a group, so we keep a few landmarks — the time factor, the human factor, and for the cutters, the cutter mailman, who happens to be Jean-Paul. A first name that is not really one's own name because the thing about a name is that it isn't ours, it's a name passed down by a dynasty, from a string of legally-registered names of our male ancestors.
But we're not rejecting our ancestors. They created our world, what we call reality. But we build up the Web to create another dream. And we launch our cutters in all directions, to make contacts.
= What do you think of the debate about copyright on the Web?
We don't feel involved.
a) If it means "respect", it's a matter of morality and style, so there's nothing to discuss. On the Web, as elsewhere, we quote our sources. Complete respect. For most of us.
b) If it means "copyright", we're on legal ground, which is by nature shaky. Copyright is a recent notion the French attribute to Beaumarchais, a business man with a dark side, an arms dealer and great writer. The advent of digitization, and therefore cloning (which raises a different problem to the one of copying, which was solved long ago), forces us to reconsider this notion.
c) If it means "author's rights" (in the plural), we're in the economic field, where we know what the attitude is: competition, withholding information, being top of the class and stopping others from getting there.
Sony publishes CDs (audio and ROM) because it earns them good money. And it makes CD burners (which enable you to clone its own CDs, as well as those of its rivals) because it earns them more good money. Philips was doing the same thing until it sold its Polygram division (which, according to the rules of economics, it could buy back if it wanted).
"It's not enough to be big to be successful but, in a totally globalized financial world, it helps." (Hervé Babonneau, Ouest-France (French daily newspaper), August 6, 1999). "A funny aim", says the sturdy cutter. Jurassic Games and tyrannosaurus more or less rex.
Although it's marginally economic (we have to pay for a domain name and a subscription to the server), our cutter-space isn't limited to that and we don't have a competitive attitude. Our site can be freely downloaded, and we download sites we think are creative.
It's normal to clone someone else's work and give it away as a gift. It's a way to share. What's disgusting is to sell a clone.
The job of legal experts is to prove the authorities right: yesterday it was the guillotine for backstreet abortionists, today the social security reimburses the cost of abortions (in France, though not in Poland).
Copyright or author's rights, a European vision or a US one, which will prevail? The sacred principle of private property. The property of those who have the means to keep it. Through the World Trade Organization (WTO), for example, which is in charge of settling "rights" issues anywhere in the world (even the virtual world) and, they hope, permanently.
If your house is in the path of a future highway, you know the real price of something untouchable.
So the rights of authors, creators, inventors…
Orson Welles was gobbled up by the big studios, but Kubrick carefully stayed independent of them. The law made to measure by Uncle Picsou matters little. Over time, small mammals have eaten tyrannosaurs. And we've cut off the heads of kings, who supposedly drew their power from the gods. And we did that more quickly.
"To give a purer meaning to the words of the tribe", Stéphane Mallarmé wrote. And when the credit cards have won (apparently in three years' time), we must invent other ways to take us to another Cape of Good Hope, where we can watch "new stars rise from the distant horizon", like J.M. de Heredia.
= How do you see the growth of a multilingual Web?
Your book (which is really good and useful — I get something out of it every time I read it, and it has good addresses too) deals with this whole subject: "Sooner or later the presence of languages on the Web will reflect their strength around the world." Depending on the energy of those who speak them.
= What is your best experience with the Internet?
How light-headed we felt when we received our first message… coming from Canada. 10,000 (?) years after the Inuits, our cutters had just discovered America!
= And your worst experience?
All the sleep I'm missing…
*Interview of June 25, 2000 (original interview in French)
= How did using the hyperlink change your writing?
Surfing the Web is like radiating in all directions (I'm interested in something and I click on all the links on a home page) or like jumping around (from one click to another, as the links appear). You can do this in the written media, of course. But the difference is striking. So the Internet didn't change my life, but it did change how I write. You don't write the same way for a website as you do for a script or a play.
But it wasn't exactly the Internet that changed my writing, it was the first model of the Mac. I discovered it when I was teaching myself Hypercard. I still remember how astonished I was during my month of learning about buttons and links and about surfing by association, objects and images. Being able, by just clicking on part of the screen, to open piles of cards, with each card offering new buttons and each button opening onto a new series of them. In short, learning everything about the Web that today seems really routine was a revelation for me. I hear Steve Jobs and his team had the same kind of shock when they discovered the forerunner of the Mac in the laboratories of Rank Xerox.
Since then I've been writing directly on the screen. I use a paper print-out only occasionally, to help me fix up an article, or to give somebody who doesn't like screens a rough idea, something immediate. It's only an approximation, because print forces us into a linear relationship: the words scroll out page by page most of the time. But when you have links, you've got a different relationship to time and space in your imagination. And for me, it's a great opportunity to use this reading/writing interplay, whereas leafing through a book gives only a suggestion of it — a vague one because a book isn't meant for that.
BRIAN KING
#Director of the WorldWide Language Institute, who initiated NetGlos (The Multilingual Glossary of Internet Terminology)
One of the WorldWide Language Institute's projects is NetGlos (The Multilingual Glossary of Internet Terminology), which has been compiled since 1995 as a voluntary, collaborative project by a number of translators and other professionals. Versions for the following languages are being prepared: Chinese, Croatian, English, Dutch/Flemish, French, German, Greek, Hebrew, Italian, Maori, Norwegian, Portuguese, and Spanish.
*Interview of September 15, 1998
= How did using the Internet change the life of your organization?
Our main service is providing language instruction via the Web. Our company is in the unique position of having come into existence because of the Internet!
= How do you see the growth of a multilingual Web?
Although English is still the most important language used on the Web, and the Internet in general, I believe that multilingualism is an inevitable part of the future direction of cyberspace.
Here are some of the important developments that I see making a multilingual Web a reality:
1. Popularization of information technology
Computer technology has traditionally been the sole domain of a "techie" elite, fluent in both complex programming languages and in English, the universal language of science and technology. Computers were never designed to handle writing systems that couldn't be translated into ASCII (American Standard Code for Information Interchange). There wasn't much room for anything other than the 26 letters of the English alphabet in a coding system that originally couldn't even recognize acute accents and umlauts, not to mention nonalphabetic systems like Chinese.
But tradition has been turned upside down. Technology has been popularized. GUIs (graphical user interfaces) like Windows and Macintosh have hastened the process (and indeed it's no secret that it was Microsoft's marketing strategy to use their operating system to make computers easy to use for the average person). These days this ease of use has spread beyond the PC to the virtual, networked space of the Internet, so that now nonprogrammers can even insert Java applets into their webpages without understanding a single line of code.
2. Competition for a chunk of the "global market" by major industry players
An extension of (local) popularization is the export of information technology around the world. Popularization has now occurred on a global scale and English is no longer necessarily the lingua franca of the user. Perhaps there is no true lingua franca, but only the individual languages of the users. One thing is certain: it is no longer necessary to understand English to use a computer, nor is it necessary to have a degree in computer science.
A pull from non-English-speaking computer users and a push from technology companies competing for global markets have made localization a fast growing area in software and hardware development. This development has not been as fast as it could have been. The first step was for ASCII to become Extended ASCII. This meant that computers could begin to recognize the accents and symbols used in variants of the English alphabet, mostly used by European languages. But only one language could be displayed on a page at a time.
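The "one language at a time" limit comes from the fact that each 8-bit extension reuses the same upper byte values for a different alphabet. A minimal sketch using Python's built-in codecs; the choice of byte 0xE9 and of these three code pages is only an illustration, and `decode_byte` is a hypothetical helper name.

```python
import unicodedata


def decode_byte(value, codec):
    """Decode a single byte value under the named 8-bit code page."""
    return bytes([value]).decode(codec)


# The same byte means a different letter in each code page, so a page of
# text has to commit to exactly one of them.
for codec in ("latin-1", "iso8859-7", "cp1251"):
    char = decode_byte(0xE9, codec)
    print(f"{codec}: byte 0xE9 is U+{ord(char):04X} {unicodedata.name(char)}")

# Plain 7-bit ASCII has no meaning at all for this byte.
try:
    decode_byte(0xE9, "ascii")
except UnicodeDecodeError:
    print("ascii: byte 0xE9 cannot be decoded")
```

The three code pages here are Western European, Greek, and Cyrillic respectively; mixing, say, French and Greek on one page would require two incompatible readings of the same bytes.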
3. Technological developments
The most recent development is Unicode. Although still evolving and only just being incorporated into the latest software, this new coding system translates each character into 16 bits. Whereas 8-bit Extended ASCII could only handle a maximum of 256 characters, Unicode can handle over 65,000 unique characters and therefore potentially accommodate all of the world's writing systems on the computer.
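The capacity figures follow directly from the width of the code in bits: a code of n bits can distinguish 2^n values. A short sketch of that arithmetic, with hypothetical helper names and three sample characters chosen for illustration; it reflects the 16-bit picture given in this 1998 interview, and later versions of Unicode define code points beyond that range.

```python
def code_space(bits):
    """Number of distinct values a fixed-width code of `bits` bits can represent."""
    return 2 ** bits


def fits(char, bits):
    """True if the character's Unicode code point fits in a code of `bits` bits."""
    return ord(char) < code_space(bits)


# Sample characters written as escapes so the listing itself stays plain ASCII.
SAMPLES = {
    "Latin capital A": "\u0041",
    "e with acute accent": "\u00e9",
    "Chinese character for 'middle'": "\u4e2d",
}

print("7-bit ASCII:", code_space(7), "values")
print("8-bit Extended ASCII:", code_space(8), "values")
print("16-bit Unicode:", code_space(16), "values")

for label, char in SAMPLES.items():
    widths = [bits for bits in (7, 8, 16) if fits(char, bits)]
    print(f"{label} (U+{ord(char):04X}) fits in: {widths} bits")

# In the 16-bit encoding the Chinese character occupies exactly two bytes.
print(len("\u4e2d".encode("utf-16-be")), "bytes")
```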
So now the tools are more or less in place. They are still not perfect, but at last we can at least surf the Web in Chinese, Japanese, Korean, and numerous other languages that don't use the Western alphabet. As the Internet spreads to parts of the world where English is rarely used, such as China, it is natural that Chinese, and not English, will be the preferred choice for interacting with it. For the majority of the users in China, their mother tongue will be the only choice.
There is a change-over period, of course. Much of the technical terminology on the Web is still not translated into other languages. And as we found with our Multilingual Glossary of Internet Terminology — known as NetGlos — the translation of these terms is not always a simple process. Before a new term becomes accepted as the "correct" one, there is a period of instability where a number of competing candidates are used. Often an English loanword becomes the starting point — and in many cases the endpoint. But eventually a winner emerges that becomes codified into published technical dictionaries as well as the everyday interactions of the nontechnical user. The latest version of NetGlos is the Russian one and it should be available in a couple of weeks or so (end of September 1998). It will no doubt be an excellent example of the ongoing, dynamic process of "russification" of Web terminology.