Some simple nuts and bolts of personalized TV (video)
© Gary Hayes 2005

Thought it would be interesting for those readers of this blog who want to get under the bonnet a little on how we are represented digitally in the wonderful world of video and audio personalization. This will not be totally under the hood, pulling apart the engine pistons or tweaking the timing; more of a where-to-put-the-oil-and-water kind of thing. It will hopefully provide an accessible view of a current standard for audio-video personalization at least – using the TV-Anytime system. What is detailed below can work on mobile, home or PC platforms, in fact anything that can receive AV or interactive content – we don’t like to discriminate here at personalize media 😉

A couple of major current open-standard metadata players on the block are of course TV-Anytime and MPEG7 (yes, the next one after 4 – don’t ask me why, especially as the next one is 21!). Focusing on TV-Anytime we shall look at a key part, the content and consumer schemas – basically lists of tags that you can attach to media and people. When TV-Anytime’s classification and MPEG7’s user preferences (a sub-part of TVA in any event) are linked together they provide the way to personalize your media experience – albeit first generation. Things get quite complex when you start to cross-map TVA/MPEG7 with other metadata systems or across devices, but that’s another story; a little more on this at the bottom.

If you’ve switched off already then perhaps jump to another post, but if you are still here, what follows is the more technically interesting insight. So let’s go under the bonnet at the highest level (you just lifted up the hood and are working out what is where): there are two main bits, content metadata and consumer metadata.

Describing Content with metadata
Looking at the content area first, it has two main elements – Creation (plus Creation Information) and Classification. For personalization to work every piece of content needs to have a bunch of metadata attached. The first type is the ‘Creation DS’: lots of info about who made it, its purpose, genre, form, target, when it was made or altered, who is in it and so on. The second bit, the ‘Classification DS’, is a nine-way (yes, nine!) high-level matrix of every imaginable way to describe linear and interactive content (link to some of the standards docs – enter TS102822 into the box). As an aside, I mostly wrote the interactive format bit of it way back when – time will tell if it stands up to scrutiny! Anyway, the classification dictionary contains the following elements (I know the guide uses the word ‘programmes’ a lot, but think of these as brands or properties as well):

– Intention – The primary apparent intention of the programme
– Format – This dimension is used to classify programmes as to their formal structure; in other words, how does the programme look, regardless of the subject with which the programme is dealing.
– Content – This dimension is used to classify programmes according to their content or subject. Unlike in the case of the form dimension, it is essential to watch or listen to the programme.
– Commercial product – This dimension is used to classify content that advertises or promotes a commercial product or service.
– Intended Audience – Programmes intended for special audiences defined by age, cultural/ethnic background, profession etc.
– Origination – The original distribution method or platform for the content
– Content Alert – A category dimension that alerts to potentially disturbing material
– MediaType – This dimension classifies the media components of the work, such as audio, video, text, graphics etc., and enhancements
– Atmosphere – The most culturally and individually subjective area, but a series of adjectives, such as compelling or heart-warming, that describe the work

OK – still with me? Each of the above dimensions may in some cases have up to 700 individual possible entries, and combined with the Creation DS each piece of content could have from one to many hundreds of tags attached. Here is a quote from the metadata part of the spec (which may or may not help):

‘In multi-dimensional classification systems each content item is usually classified as many times as there are dimensions in the system. A multi-dimensional classification system can be understood as a way to describe a content item according to several coordinates in a multi-dimensional space.
In such a multi-dimensional classification system each content item is potentially classifiable in each of the dimensions used – i.e. each dimension is applicable to every program or commercial.
Each dimension is used to describe content from a single viewpoint. Classification of a program in one specific dimension may not, by itself, be meaningful. In most cases, it is only the combination of classification terms drawn from multiple dimensions that leads to significance.
Each dimension is structured in a hierarchical way to enable greater precision and flexibility in the description of the aspect involved.’

To put things in real-world perspective, let’s choose a simple example and only fill in a few fields to help you out here:

Let’s choose a film everyone knows: Artificial Intelligence, the Spielberg one about the robot kid who wanted, like Pinocchio, to become real – too much detail, Gary. Some typical metadata (only one per dimension – you can have many more) that would describe this:
– Intention – 1.1.1 Pure Entertainment
– Format – 2.2.1 Fictional portrayal of life
– Content – Science Fiction
– Commercial product – as it is not advertising anything, no field
– Intended Audience – 4.1 General Audience
– Origination – 5.3 Cinema Industry Originated Movie
– Content Alert – 6.3.4 Deliberate killing of human beings (OK, in one scene they are robots, but to a kid?)
– MediaType – 7.1.3 Audio and Video
– Atmosphere – Could choose about 10 or so here, but let’s for now just have 8.29 Insightful
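For the programmatically minded, the one-term-per-dimension tagging above can be sketched as a simple data structure. This is a rough illustration only – the term IDs are the ones from the list above, and the real dictionaries in TS102822 are far richer (Commercial product is omitted, as it has no field here):

```python
# Illustrative sketch only: term IDs come from the example above;
# the full classification dictionaries live in TS102822.
ai_classification = {
    "Intention":        ("1.1.1", "Pure Entertainment"),
    "Format":           ("2.2.1", "Fictional portrayal of life"),
    "Content":          (None,    "Science Fiction"),
    "IntendedAudience": ("4.1",   "General Audience"),
    "Origination":      ("5.3",   "Cinema Industry Originated Movie"),
    "ContentAlert":     ("6.3.4", "Deliberate killing of human beings"),
    "MediaType":        ("7.1.3", "Audio and Video"),
    "Atmosphere":       ("8.29",  "Insightful"),
}

def tags(classification):
    """Flatten one-term-per-dimension metadata into (dimension, id, label) tuples."""
    return [(dim, term_id, label) for dim, (term_id, label) in classification.items()]

for dim, term_id, label in tags(ai_classification):
    print(dim, term_id, label)
```

Remember each dimension can hold many terms in practice – a real record for this film would be a list per dimension, not a single tuple.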

In simple terms these numbers are placed into a typical nested XML schema – if you have never seen one of these then don’t worry, but here’s a little taster: a lot of code to basically say this is ‘pure entertainment’.
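Something along these lines – a hypothetical TVA-style fragment (the element names and urn are illustrative, not copied verbatim from the spec), here embedded in Python so we can also pull the term back out:

```python
import xml.etree.ElementTree as ET

# A hypothetical TVA-style fragment (element names and urn are illustrative,
# not quoted verbatim from TS102822) -- a lot of code to say "pure entertainment".
snippet = """
<ProgramInformation programId="crid://example.com/ai-movie">
  <BasicDescription>
    <Title>Artificial Intelligence</Title>
    <Genre href="urn:tva:metadata:cs:IntentionCS:2004:1.1.1">
      <Name>Pure Entertainment</Name>
    </Genre>
  </BasicDescription>
</ProgramInformation>
"""

root = ET.fromstring(snippet)
genre = root.find("./BasicDescription/Genre")
term_id = genre.get("href").rsplit(":", 1)[-1]  # last segment of the urn
print(term_id, genre.findtext("Name"))          # -> 1.1.1 Pure Entertainment
```

Note how the classification term itself is just that tiny `1.1.1` at the end of the `href` – everything else is plumbing.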


Describing Consumers with metadata
Right, all the above was simply an appetiser for the main course. Once all this content is described, how can we match it to us humans? There are two ways:

1 – It learns the stuff you are watching/doing
2 – You give it a start by telling it what you like (or change it, or tell it what you like, later)

So TVA/MPEG7 has two consumer metadata areas to deliver this and build profiles:

1 – Usage History ‘ UserActionHistory, UserActionList, UserAction
2 – User Preferences – UserPreferences, UserIdentifier, FilteringAndSearchPreferences, CreationPreferences, ClassificationPreferences, SourcePreferences, PreferenceConditions, BrowsingPreferences, SummaryPreferences
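As a rough mental model of those two areas (a simplification of my own, not the schema itself – the real description schemes carry far more structure):

```python
from dataclasses import dataclass, field

# Simplified mental model of the two consumer metadata areas above --
# the real TVA/MPEG7 schemas are considerably richer.
@dataclass
class UserAction:
    action_type: str       # e.g. "PlayRecording", "ChangeChannel" (illustrative names)
    program_id: str

@dataclass
class ClassificationPreference:
    term: str              # e.g. "Science Fiction"
    preference_value: int  # MPEG7-style: -100 (hate) .. +100 (love)

@dataclass
class UserProfile:
    user_identifier: str
    usage_history: list = field(default_factory=list)         # cf. UserActionHistory
    classification_prefs: list = field(default_factory=list)  # cf. ClassificationPreferences

profile = UserProfile("viewer-42")
profile.classification_prefs.append(ClassificationPreference("Science Fiction", 100))
profile.usage_history.append(UserAction("PlayRecording", "crid://example.com/ai-movie"))
```

The point is simply that one side of the profile is filled by the system watching you, the other by you telling it things.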

As you may have spotted, the above and the detail within (if you look at some of the docs linked above) cover many but not all of what us humans do with our media – but it is a start! At the most basic level, a user can build their ‘user preferences’ and say they ‘really like’ science fiction movies by putting a value of 100 (MPEG7 preference values go from −100 to +100 – more on this later!) into the ClassificationPreferences field, tagged with the Science Fiction classification term (remember from above?). Well, they don’t enter all the code of course – they may just press the blue colour button, which means ‘like’! Simple stuff. On the automatic profiling side of the fence, if the only thing a person watches are films with Jennifer Lopez in, then an implementation (a service built using the standard) could over time set the UsageHistory to 100 or 10 or −40, all dependent. Given the granularity of the specification an implementation could watch ALL detail, and then anything is possible. Simple examples:

1 – tracks that you always change channels when Russell Crowe comes on (if Mr. Crowe is time-tagged in several films, of course)
2 – tracks that you always take part in socialist political chat forums
3 – notices that you watch twice as much rowing as west coast hip hop
4 – in fact, a billion-plus alternates
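A toy sketch of the automatic-profiling side. The spec defines the fields, not how a service learns, so the update rule here (a simple exponential moving average, clamped to the MPEG7 range) is entirely my own illustration:

```python
def update_preference(current, signal, rate=0.2):
    """Nudge an MPEG7-style preference (-100..+100) toward an observed signal.

    signal: e.g. +100 for 'watched to the end', -100 for 'changed channel'.
    The moving-average rule is illustrative only, not from the spec.
    """
    new = current + rate * (signal - current)
    return max(-100.0, min(100.0, new))

pref = 0.0
# Viewer changes channel every single time Russell Crowe appears:
for _ in range(10):
    pref = update_preference(pref, -100)
print(round(pref, 1))  # -> -89.3, drifting toward 'really don't like'
```

How fast the profile drifts (the `rate`), and whether one bad night of channel-hopping should outweigh a year of loyal viewing, is exactly the sort of thing left to implementers.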

The above examples of course depend on implementation of the basic tools – the person who invented the wheel is not responsible for drive-by shootings (you get the metaphor, in terms of invasiveness of tracking?). Anyway, I hope you are now more familiar with the basic principles/technologies at work here – we have a meld of content and consumer metadata that tracks viewing habits – first day at ‘agent’ creation school.

Interoperability – how we can make it work across devices, time and with new content/services
There are a few interesting dilemmas arising from the above, but I will leave the detail for later. As with any technology, the basic form, say the car, can be built in many ways and have many varying non-interchangeable parts – ever tried fitting your Toyota with a Ford alternator? So getting your tastes mapped onto other devices, or having yourself tracked on another system, may not be without problems. Without strong guidance from test bodies and first-to-market implementers (similar to the wonderful way open-source apps are being developed at the moment) we could get to a point where all of this may be irrelevant – apart from one system at a time. More later.

Here is one example to get you thinking (commenting on this blog?) about implementation and interoperability issues from the basic metadata intro above:
E.g. preference strength and cross-mapping. The MPEG7 prefs scale goes from −100 to +100. If one provider has a ‘don’t like (−100)’, ‘not bad (0)’, ‘like (+100)’ selection system (three choices) and builds your profile up this way, will that make sense on the next system, which may have ten levels of choice (from bad to great) and may have decided to go for a 0–100-only implementation? How do you map an ‘OK’ on one system to ‘Average’ on another – the type of thing that keeps those system integrators happy for years to come.
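The naive answer is a linear rescale plus a snap to the target system’s levels – sketched below. The catch, and the real interoperability question, is whether both scales are actually linear and mean the same thing at each point:

```python
def map_preference(value, src=(-100, 100), dst=(0, 100), levels=None):
    """Linearly rescale a preference from one system's range to another's.

    Assumes both scales are linear and comparable -- which is precisely
    the assumption that may not hold between real systems.
    """
    s_lo, s_hi = src
    d_lo, d_hi = dst
    mapped = d_lo + (value - s_lo) * (d_hi - d_lo) / (s_hi - s_lo)
    if levels:  # snap to the target system's discrete choices, e.g. ten levels
        step = (d_hi - d_lo) / (levels - 1)
        mapped = d_lo + round((mapped - d_lo) / step) * step
    return mapped

print(map_preference(0))             # 'not bad' on -100..+100 -> 50.0 on 0..100
print(map_preference(60, levels=10)) # snapped to one of ten levels (~77.8)
```

Even this toy version shows the lossiness: three coarse choices smeared across ten levels, and there is no way back to the original value once it has been snapped.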

There will be more issues to do with this coming up in future posts if I go a little techie again – just thought I would throw this in to get the uninitiated acclimatized a little to the background workings of personal TV (primarily) and personalization systems.

Posted by Gary Hayes ©2005