Clustering: Describing Multitudes


Zell
09-13-2000, 11:46 PM
More fundamental than descriptions of noise and conversation levels is that of physical items: landscapes, buildings, people, junk -- anything. As we move around in the virtual world and focus our attention on various things using text commands or mouse clicks or whatnot, it is the duty of the server to compile informative and elegant descriptions of whatever we are looking at.

When we walk into a room, we want a brief overview of where we've arrived, a description of our first impressions; when we study a sword hilt closely we want a minutely detailed description of what the hilt looks like and anything about it that may be out of the ordinary. As we design algorithms to compile these descriptions, one of the most common problems turns out to be to compile a composite description of a set of distinct physical items. We call this our clustering system, and this posting is a peek at a dilemma with which I'm wrestling this week.

It gets pretty technical... but hey, it's The Tech forum.

So, we're given a set of items -- anything; thirty-three pieces of bubblegum; half a dozen historical warlords stuck in a cage at the zoo; a piece of bread and a sword and an onion; the engraving on a ring and a carrot. We call these items details. Our task is to produce an English noun phrase that describes the set of details -- in fact, a description very much like the ones in my previous sentence.

If the details are a sword, a sword, a sword and a shield, then the proper description may be 'three swords and a shield', which requires a pluralization of 'sword'. If we also know that two of those swords are rapiers and one is a cutlass, then perhaps the better description is 'two rapiers, a cutlass and a shield'.

Now, for our purposes, each of these details is fully described by A) a list of names -- singular and plural nouns, B) a list of adjectives, C) a noun phrase called the 'brief' of the item, and D) a flag signifying whether or not the 'brief' constitutes a proper noun such as 'Taj Mahal', as opposed to 'long sharp thin steel sword'.

This is a problem that's been solved with varying degrees of success in the industry; Skotos' current proof-of-concept algorithm is somewhat flawed but quite ambitious. It first finds the most common shared name in the set -- 'sword' in our simple example above -- and outputs the share count and the plural form of the name -- 'three swords' -- then applies the algorithm again to the remainder of the set after removing the accounted-for details.
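To make the steps above concrete, here is a minimal Python sketch of that greedy approach. It is not the Skotos implementation: the `Detail` class, the `cluster` function and the helper names are all invented for illustration, adjectives are left out, and each detail is assumed to have a non-empty brief.

```python
from collections import Counter
from dataclasses import dataclass

NUMBER_WORDS = ("zero one two three four five six seven eight nine ten eleven "
                "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
                "nineteen twenty").split()


@dataclass
class Detail:
    # Hypothetical stand-in for a detail; adjectives are omitted in this sketch.
    names: dict[str, str]   # singular -> plural, e.g. {"sword": "swords"}
    brief: str              # noun phrase, e.g. "long sharp thin steel sword"
    proper: bool = False    # True when the brief is a proper noun ("Taj Mahal")


def count_word(n):
    """Spell out small counts, fall back to digits for larger ones."""
    return NUMBER_WORDS[n] if n < len(NUMBER_WORDS) else str(n)


def describe_one(detail):
    """Describe a single detail by its brief, with an article unless it is a proper noun."""
    if detail.proper:
        return detail.brief
    article = "an" if detail.brief[0].lower() in "aeiou" else "a"
    return f"{article} {detail.brief}"


def join_phrases(phrases):
    """Join as 'a, b and c' (no serial comma, matching the examples in the post)."""
    if len(phrases) < 2:
        return "".join(phrases)
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]


def cluster(details):
    """Greedy clustering: repeatedly consume the name shared by the most details."""
    remaining = list(details)
    phrases = []
    while remaining:
        counts = Counter(name for d in remaining for name in d.names)
        best = max(counts, key=counts.get, default=None)
        if best is None or counts[best] < 2:
            # Nothing is shared any more: describe each leftover on its own.
            phrases.extend(describe_one(d) for d in remaining)
            break
        plural = next(d.names[best] for d in remaining if best in d.names)
        phrases.append(f"{count_word(counts[best])} {plural}")
        remaining = [d for d in remaining if best not in d.names]
    return join_phrases(phrases)
```

Given two rapiers and a cutlass that all also carry the name 'sword', plus a shield, `cluster` returns 'three swords and a shield': the broadest shared name wins every time.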

This way, broad categorizations always take precedence, which sometimes works well -- as in the case of 'three swords and a shield' -- but sometimes less so, such as in the case of 'nineteen primates and two dogs' where 'eleven humans, eight chimpanzees and two dogs' would have been much more elegant.

So that's an introduction to the dilemma; I'll post later on what looks like viable improvements. Meanwhile, any comment is more than welcome.

Zell

Buggy
09-14-2000, 07:02 AM
Well, this isn't so much clustering as cluttering.

Let's say that as a storybuilder, I decide to make a nice sitting room. I put in a fine oriental rug, a number of elegant leather chairs, some tapestries on the walls, a chandelier and an endtable. On the endtable, I put a humidor containing a number of fine cigars, an intricate lighter, and a book of prose.

Now, the game designer in me says, "I want the players to be able to identify and interact with every object in the room. They should be able to smell the tobacco; examine the plush, red velvet on the chairs; and swing on the chandelier, raining crystals down on the other players."

The game builder in me says, "Can you imagine how much work that will be? The more detail, the more typing!" It also says, "Well, can the Skotos system handle hundreds -- perhaps thousands -- of rooms, each filled with dozens of intricately designed objects?"

So, where is the balance struck? If I make, say, the rug part of the room description, then the international rug cartel cannot steal it. But if I make everything an object, the details will drive me insane and -- probably -- clutter people's screens.

I realize this only slightly touches on tech, but it's a concern.

Thanks,

Buggy

JeffCrook
09-14-2000, 09:32 AM
Originally posted by Buggy:
So, where is the balance struck? If I make, say, the rug part of the room description, then the international rug cartel cannot steal it. But if I make everything an object, the details will drive me insane and -- probably -- clutter people's screens.

I was wondering the same thing. Especially what Queen Vivienne will do about the rug war. We must keep illegal rugs out of the hands of children. Because where will they lead? Yes, today it might be okay for little Johnny to play on his rug, but what about tomorrow? He won't be satisfied with rugs, he'll want carpets, and then probably throw pillows and (gasp!) a BEAN BAG CHAIR. Think what that will do to his posture!

Actually, though, I do have the same concerns as Buggy concerning detailing objects. I mean, you could go quite insane, mad I tell you, putting things in drawers and cabinets and under beds and and and rugs! "I've been working on this stage for six months and I've finally finished the first room." Could you imagine trying to do Holmes' rooms at 221B Baker Street?

So anyway, as for the algorithm question, it seems a sticky problem in logic. I hadn't thought of that, but as players move about in a stage or theatre, they are bound to move things around, leave their brandy glasses on the mantel, and so forth. Those objects would need to be accounted for. Especially in something like a Holmes mystery stage. One clue might be the brandy in the glass on the mantel. An algorithm would need to keep track of when the brandy has completely evaporated, so that when our sleuth investigates it, he can say that the murder happened six hours ago because the brandy has hardly evaporated at all, or three days ago because the glass is dry and stained.

Oh Lord, talk about a mess of details. It boggles the mind.

Deimos
09-14-2000, 11:05 AM
Originally posted by Zell:
This way, broad categorizations always take precedence, which sometimes works well -- as in the case of 'three swords and a shield' -- but sometimes less so, such as in the case of 'nineteen primates and two dogs' where 'eleven humans, eight chimpanzees and two dogs' would have been much more elegant.

You might want to have a look-see at the Summer 2000 issue of The Perl Journal (#18), which has an article on Lingua::Wordnet which in part handles exactly this sort of problem (how are groups of objects related, how far can we generalize before we start talking strangely about "multi-celled organisms" instead of "two dogs, a cat, a monkey and a toad").

See:

http://www.cogsci.princeton.edu/~wn/ (Wordnet home)
http://www.tpj.com/ (Perl Journal)
http://www.perl.com/CPAN-local/README.html (CPAN)


Cheers

Buggy
09-14-2000, 11:09 AM
Originally posted by JeffCrook:

So anyway, as for the algorithm question, it seems a sticky problem in logic. I hadn't thought of that, but as players move about in a stage or theatre, they are bound to move things around, leave their brandy glasses on the mantel, and so forth. Those objects would need to be accounted for. Especially in something like a Holmes mystery stage. One clue might be the brandy in the glass on the mantel. An algorithm would need to keep track of when the brandy has completely evaporated, so that when our sleuth investigates it, he can say that the murder happened six hours ago because the brandy has hardly evaporated at all, or three days ago because the glass is dry and stained.

Oh Lord, talk about a mess of details. It boggles the mind.



Well, if I remember my college classes of ten years ago, there are some fine standard differential equations for figuring that stuff out.

Luckily, this is a real-time system, so if we know the rate of evaporation, we can have the computer evaporate a little bit every so often.

Not as elegant a solution as using diff eq, but there it is.

Of course that does make one wonder: can things happen in rooms of Skotos when nobody is there? Can I set a bomb to go off at midnight even if there's nobody in the room? That'll teach those beanbag-thieving miscreants!

Zell
09-14-2000, 01:11 PM
Originally posted by Buggy:

Now, the game designer in me says, "I want the players to be able to identify and interact with every object in the room. They should be able to smell the tobacco; examine the plush, red velvet on the chairs; and swing on the chandelier, raining crystals down on the other players."

The game builder in me says, "Can you imagine how much work that will be? The more detail, the more typing!" It also says, "Well, can the Skotos system handle hundreds -- perhaps thousands -- of rooms, each filled with dozens of intricately designed objects?"


The system can certainly handle it. Right now you can add any number of static details you desire, and within the month we'll finish the event system, which is essentially the developer's ability to hook logic into things that happen in the room -- smelling the tobacco or, more complexly, the chandelier swing. Both will be quite doable.

We're trying to blur the line between details (part of the room) and objects (self-sufficient) as much as possible. We've done a fair bit to get there -- details are treated internally almost precisely the same way as full objects. The big difference is that details cannot be portable. Later we have hopes of blurring even that, so that details can be instantiated into full objects at the moment they are picked up.

The event system will work both for details and objects.


So, where is the balance struck? If I make, say, the rug part of the room description, then the international rug cartel cannot steal it. But if I make everything an object, the details will drive me insane and -- probably -- clutter people's screens.


The clustering system should help with the clutter, and we hope to give you better and better tools for it over the course of the fall. We have a prototype system that lets you create, say, eight torches, place them in different proximities in a room, and assign a 'group identifier' of a sort to them which has associated with it an English sentence that is then made part of the description.

In other words, instead of listing the torches as part of the shopping-list of objects present in the room, the room description would be appended with the sentence you write, e.g. "The light of eight torches flickers eerily over the surface of the stone walls." The sentence uses SAM, our active markup system, and so can include dynamic data to count the number of torches so that even if I were to take one of the torches it could say 'seven' instead of 'eight'. And so on. This system would put a lot of descriptive power into the hands of the developer -- no promises, though, this is at an early stage.
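As a rough illustration of a group sentence with a live count, here is a small Python sketch. It does not use SAM syntax, which isn't shown in this thread; the `$count` and `$noun` placeholders, the `render_group` function and its parameters are all made up for the example, built on Python's standard `string.Template`.

```python
from string import Template

NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]

# Hypothetical group sentence; $count and $noun are filled in at render time.
TORCH_SENTENCE = ("The light of $count $noun flickers eerily "
                  "over the surface of the stone walls.")


def render_group(sentence, singular, plural, count):
    """Fill in the current member count so the sentence tracks the room's state."""
    word = NUMBER_WORDS[count] if count < len(NUMBER_WORDS) else str(count)
    noun = singular if count == 1 else plural
    return Template(sentence).substitute(count=word, noun=noun)
```

Rendering with a count of 8 gives the 'eight torches' sentence; once a player walks off with one torch, rendering with 7 gives 'seven torches' without the developer touching the text.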

As to how much effort you want to put into each room, well, that's entirely up to you. I've played carefully crafted games that take place in as few as ten rooms, where over the course of the plot the rooms slowly reveal more hidden detail and intricacy. I've played other games with huge landscapes of simple but beautifully described rooms. Both approaches give great joy if done well.

Zell

Zell
09-14-2000, 01:18 PM
Originally posted by Buggy:
Well, if I remember my college classes of ten years ago, there are some fine standard differential equations for figuring that stuff out.

Actually, I believe evaporation is measured in mass per unit area per unit time, so as long as the glass is roughly cylindrical the rate is independent of the amount of liquid remaining. If we take into account the curvature of the brandy glass, however, I suppose a closed-form solution would require solving a differential equation. However...


Luckily, this is a real-time system, so if we know the rate of evaporation, we can have the computer evaporate a little bit every so often.

Of course that does make one wonder: can things happen in rooms of Skotos when nobody is there? Can I set a bomb to go off at midnight even if there's nobody in the room? That'll teach those beanbag-thieving miscreants!

... precisely, it'd suffice to send a little heartbeat through the room every five minutes or so and evaporate another puff (we could even calculate the average area of the surface of the liquid and get a decent approximation).
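The heartbeat idea could look something like the following Python sketch. The `Glass` class, the `heartbeat` function and the rate parameter are hypothetical, the surface area is held constant (the cylindrical case), and any rate you pass in is a placeholder rather than a measured figure for brandy.

```python
from dataclasses import dataclass

HEARTBEAT_SECONDS = 300  # "every five minutes or so"


@dataclass
class Glass:
    liquid_g: float      # mass of brandy left in the glass
    surface_cm2: float   # exposed surface; assumed constant (cylindrical glass)


def heartbeat(glass, rate_g_per_cm2_s, elapsed_s=HEARTBEAT_SECONDS):
    """Evaporate one puff and return the mass lost during this tick."""
    # Loss is rate x area x time, capped so the glass never goes below empty.
    lost = min(glass.liquid_g, rate_g_per_cm2_s * glass.surface_cm2 * elapsed_s)
    glass.liquid_g -= lost
    return lost
```

Because the loss per tick is constant for a fixed surface area, a sleuth could in principle divide the missing mass by that per-tick loss to estimate how long the glass has been standing, at least until it runs dry.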

Things most certainly can happen when nobody is there! In fact, for the purposes of the virtual reality simulations, players are no different from other living bodies and those in turn are not really very different from any kind of object. The world putters away happily all on its own.

Zell

JeffCrook
09-15-2000, 12:08 AM
Originally posted by Buggy:
Of course that does make one wonder: can things happen in rooms of Skotos when nobody is there? Can I set a bomb to go off at midnight even if there's nobody in the room?

Gosh, I sure hope so.

Malichor
09-27-2000, 05:19 PM
Or better yet, make the room "look" for attempts to take objects described in it, just like asking the seamstress for a pelican.

The seamstress would generate a pelican. That way the room isn't cluttered, but the room can generate the item and change its description (omitting the object's presence).

A repeated "take" of the item would get another object if there's one still available. The room would reset to its original description when either
a) a certain amount of time has passed, or
b) a housekeeping cnpc (a maid or whatnot) has come and restocked it.

Atama
10-23-2000, 08:45 PM
Wow, I must say, are YOU guys ambitious!

When I was younger, and wanted to be a game designer (this is before I realized that too much coding drives me insane) this is exactly the kind of anal and immersive detail I wanted to do.

I'm sticking around for the whole beta, and if things are anywhere near as great as you guys are planning it, I will probably stick around for the long haul!

I say kudos to all the developers of this game. This seems to me to be the most tremendous effort I have ever heard of in an online RPG (non-graphics-based at least).

jwp
10-23-2000, 11:16 PM
With regard to "clustering" objects in descriptions, people don't generally "count" more than about three items when they initially see a group of the same kind of objects. It's sensible to say "There are three roses here." but not "There are 427 roses here." For objects within "normal" size ranges (acorns, say, up to large dogs) it seems to work best to enumerate any number up to three or four, use "several" for four or five up to around ten, "many" from ten to twenty or thirty, etc. The brain actually seems to treat non-numerical quantifiers based on both size and degree of noticeability in the environment, but generally using size alone will work. "A few acorns" could be anything from four or five up to maybe twenty, while "many acorns" might mean anything up to fifty or a hundred of them depending on the environment; whereas "many Saint Bernards" is probably less than ten.
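A minimal Python sketch of that kind of quantifier mapping follows. The function name `quantify` and the default cut-offs (exact up to three, "several" up to ten) are just one reading of the ranges above; the two thresholds are parameters so that large objects can switch to "many" sooner.

```python
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five"]


def quantify(count, singular, plural, exact_up_to=3, several_up_to=10):
    """Pick an exact number word for small groups and a vague quantifier beyond that."""
    if count <= 0:
        return f"no {plural}"
    if count == 1:
        return f"one {singular}"
    if count <= exact_up_to:
        word = NUMBER_WORDS[count] if count < len(NUMBER_WORDS) else str(count)
        return f"{word} {plural}"
    if count <= several_up_to:
        return f"several {plural}"
    return f"many {plural}"
```

With the defaults, 3 roses come out as 'three roses', 7 as 'several roses' and 427 as 'many roses'. Passing `several_up_to=5` for Saint Bernards makes a group of 8 read as 'many Saint Bernards'.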

With respect to the "19 primates" problem, the most acceptable solution I ever found was to stop the backtracking at "genus" or "species". Species is generally best, since "19 pole arms" isn't as useful as "14 halberds and 5 pikes", though it tends to vary with exactly what we're dealing with (e.g., "swords" is often sufficient without breaking it into "long swords", "short swords", etc., until the user actually looks at the swords).
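One way to picture "stopping at species" is to give each item a path of names from broadest to narrowest and group at a chosen depth. The Python sketch below assumes exactly that; the taxonomy paths, the `cluster_at_depth` function and the depth numbering are invented for illustration.

```python
from collections import Counter

# Hypothetical taxonomy paths, broadest first; each level is (singular, plural).
HALBERD = (("weapon", "weapons"), ("pole arm", "pole arms"), ("halberd", "halberds"))
PIKE = (("weapon", "weapons"), ("pole arm", "pole arms"), ("pike", "pikes"))


def cluster_at_depth(items, depth):
    """Group items by the name found at `depth` in their path and count each group."""
    # Items with shorter paths fall back to their most specific name.
    counts = Counter(path[min(depth, len(path) - 1)] for path in items)
    phrases = [f"{n} {singular if n == 1 else plural}"
               for (singular, plural), n in counts.items()]
    if len(phrases) < 2:
        return "".join(phrases)
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]
```

For fourteen halberds and five pikes, depth 1 yields '19 pole arms' while depth 2 yields '14 halberds and 5 pikes'. A caller could pick the depth per category, for example staying at 'swords' until the player looks more closely.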