Natural language queries

Apropos of nothing, I came across this database file of natural-language queries that were typed into some search engine and that make for some good time-wasting reading. The data is from a project called Webclopedia, from the Natural Languages group at the Information Sciences Institute of the University of Southern California, which is “intended to answer questions posed to it in various languages, drawing its answers from text collections and/or the web, from multiple languages.” The question databases were drawn from 27,000 questions submitted to answer.com’s search engine. (Answer.com is no longer around, but here’s their 1999 site courtesy of the Internet Wayback Machine.)

I used to have this ticker on my desktop, from where I don’t remember (maybe Lycos?), that displayed a continual stream of search terms being entered in real time into their search engine. I found it fascinating to just stare at this thing and see how many “Jennifer Lopez” queries would go by in 5 minutes. It was like having a finger on the pulse of millions of web surfers’ impulses. But how I’d love to have a ticker for something like Ask Jeeves, with natural-language questions flying by. Rather than viewing mere impulses, I could feel like I had a window onto strangers simultaneously seeking answers to life’s questions en masse.

Here are some queries from the above-mentioned database that amused me:

— Where do the earthworms go when it is real dry ground? It seems you can dig forever and not find one.
— Where is a really good sports [web] site?
— Where can I buy a hat like the kind Jay Kay from Jamiroquai wears?
— If knowledge is power, and power corrupts, how can humankind survive?
— If you hover over the international date line in a helicopter, will time stand still?
— Why are there Braille things on the ATM Machines that people drive through?
— Are your eyeballs the same size throughout your whole life?
— Are you supposed to put your tongue in the other person’s mouth when French kissing?
— How do you send a movie transcript to Hollywood so that it can be produced into a film?
— What are the four railways in Monopoly?

On second thought, maybe I don’t want to see that ticker after all. Reading the various questions, I got the feeling that all I would really be doing is peering into the minds of children doing their homework and adults playing Trivial Pursuit.

First day of Japanese class

Yesterday I had my first lesson in the 3-month Japanese course I’m taking, and all in all I came away with positive impressions about the school and class, and very excited about my prospects for progressing further in my never-ending (it seems) study of this language.

I chose this particular school in part because it seemed that with class levels higher than rank beginners, enrollment was quite low (when I visited the school a month ago, the class I sat in on briefly had just one student, in what amounted to a private one-on-one tutorial setup at a group class rate!). And so I was heartened to find that there are only two other students in my particular class. Today, only one of the other two students showed up and so it was just me and a Canadian woman (who ironically works for the same English conversation school company that I do, although at a different branch). So you can imagine that there is ample time to speak in Japanese, or as those of us in ESL say, ample amounts of student-talking-time (STT) and a correspondingly smaller amount of teacher-talking-time (TTT).

One of the results of my mish-mash of Japanese classes over the years is that I’ve learned grammatical structures all over the place, depending on the school and the text used. This class I’m now taking is covering the second half of Japanese for Busy People II, which is where I was told I was at when I took a level check last month. However, I’ve only studied some of the structures covered in the first half of the book, so I was a bit hesitant as to whether the level was too high for me. But based on this first class, which was mainly a review of certain key structures from the first half of the book, I think I’ll be alright, as long as I continue to apply myself and study consistently. I was also heartened to confirm for myself that when push comes to shove, when I have to speak in Japanese and in the company of patient people, I say a lot more than I sometimes think I can say.

At the moment, I’m actually a bit more worried about the impact of these classes on my life and my health. I don’t really have enough time to go home after class and before work starts, and so I’m forced to stay out all day, which means leaving at 8 in the morning and returning home after 10. It’d be one thing (though still not pleasant) to be at one place all that time, but having to commute an hour into Tokyo in the morning, find a place to kill time in the afternoon, and then commute an hour and a half to my workplace…well, let’s just say that I was exhausted by the time I got to work.

This is to say nothing of the packed-like-sardines train in the morning. Having one’s body pushed and pulled and squeezed for 40 minutes on the most crowded commuter train in the Tokyo metropolitan area (and that’s saying a lot!) is no picnic, and doing that three times a week is not something I look forward to. Of course, my little whining is nothing compared to what those who do all of the above every day of the week, year in and year out, go through.

National Diet Library archives

I was going to write a longer parenthetical comment in my previous post about this archive, but I was so taken with it that I thought it deserved its own post. Now I may be on to something the rest of the world knows about, but I was quite shocked at coming upon the Rare Books Image Database at the National Diet Library site (the latter links to the library’s English site). The image database is in Japanese, but with a little effort (or just plain luck clicking on various links) you’ll soon be captivated by an astonishing array of prints, most of them from the Edo Period. If you’re at all interested in Japanese art, particularly ukiyo-e woodblock prints, you really should take a look at this archive. The source material isn’t the most pristine, and the scans aren’t the greatest, but they’re relatively large (usually around 200K) and adequate enough. According to the Library, there are almost 31,000 images in the database. Let me repeat: 31,000 images! I ran a search on Hiroshige, for example, and there are almost 1,500 works of his in the database. (To be fair, a good share of the database’s images are scans of texts from various rare books, which may or may not appeal to you).

To get to the ukiyo-e prints, from the main image database page, click on the purple button in the main menu navigation to be taken to the nishiki-e (color woodblock print) section. If you don’t feel comfortable searching on Kanji keywords (or your computer setup doesn’t allow it), try this page, which will allow you to browse by clicking on Hiragana characters in the left margin.

I could spend hours just clicking through this virtual gallery. Hell, what am I talking about, I already have!